ABSTRACT

A lot of research activity has recently taken place around the 
chase procedure, due to its usefulness in data integration, 
data exchange, query optimization, peer data exchange and 
data correspondence, to mention a few. As the chase has 
been investigated and further developed by a number of re- 
search groups and authors, many variants of the chase have 
emerged and associated results obtained. Due to the hetero- 
geneous nature of the area it is frequently difficult to verify 
the scope of each result. In this paper we take a closer look
at recent developments, and provide additional results. Our
analysis allows us to create a taxonomy of the chase variations
and the properties they satisfy.

Two of the most central problems regarding the chase are
termination, and discovery of restricted classes of sets of 
dependencies that guarantee termination of the chase. The 
search for the restricted classes has been motivated by a 
fairly recent result that shows that it is undecidable to deter- 
mine whether the chase with a given dependency set will ter- 
minate on a given instance. There is a small dissonance here, 
since the quest has been for classes of sets of dependencies 
guaranteeing termination of the chase on all instances, even 
though the latter problem was not known to be undecidable. 
We resolve the dissonance in this paper by showing that de- 
termining whether the chase with a given set of dependen- 
cies terminates on all instances is coRE-complete. Our re- 
duction also gives us the aforementioned instance-dependent 
RE-completeness result as a byproduct. For one of the restricted
classes, the stratified sets of dependencies, we provide
new complexity results for the problem of testing whether a 
given set of dependencies belongs to it. These results rectify 
some previous claims that have occurred in the literature.

1. INTRODUCTION



The chase procedure was initially developed for testing
logical implication between sets of dependencies [s],
for determining equivalence of database instances known
to satisfy a given set of dependencies, and for determining
query equivalence under database constraints.
Recently the chase has experienced a revival due to
its application in data integration, data exchange, data 
repair, query optimization, ontologies and data corre- 
spondence. In this paper we will focus on constraints 
in the form of embedded dependencies [S] specified by 
sets of tuple generating dependencies (tgd's). A tgd is 
a first order formula of the form 

$\forall \bar{x}, \bar{y}\, (\alpha(\bar{x}, \bar{y}) \rightarrow \exists \bar{z}\, \beta(\bar{x}, \bar{z}))$,

where $\alpha$ and $\beta$ are conjunctions of relational atoms, and
$\bar{x}$, $\bar{y}$, and $\bar{z}$ are sequences of variables. We refer to $\alpha$ as
the body and $\beta$ as the head of the dependency. Sometimes,
for simplicity, the tgd is written as $\alpha \rightarrow \beta$. Intuitively
the chase procedure repeatedly applies a series
of chase steps to database instances that violate some 
dependency. Each such chase step takes a tgd that is 
not satisfied by the instance, a set of tuples that wit- 
ness the violation, and adds new tuples to the database 
instance so that the resulting instance does satisfy the 
tgd with respect to those witnessing tuples. 

Given an instance $I$ and a set of tgd's $\Sigma$, a model of
$I$ and $\Sigma$ is a database instance $J$ such that there is a
homomorphism from $I$ to $J$, and $J$ satisfies $\Sigma$. A universal
model of $I$ and $\Sigma$ is a finite model of $I$ and $\Sigma$
that has a homomorphism into every model of $I$ and $\Sigma$.
It was shown in [9] [6] that the chase computes a universal
model of $I$ and $\Sigma$, whenever $I$ and $\Sigma$ have one. In
case $I$ and $\Sigma$ do not have a universal model the chase
does not terminate (in this case it actually converges at
a countably infinite model).

As the research on the chase has progressed several 
variations of the chase have evolved. As a consequence 
it has become difficult to determine the scope of the re- 
sults obtained. We scrutinize the four most important 
chase variations, both deterministic and non-deterministic. 
For each of these chase variations we determine the data
and combined complexity of testing whether a chase step is
applicable for a given instance and tgd. It did not come
as a surprise to find that the oblivious and semi-oblivious
chase variations share the same complexity,
but the standard chase has a slightly higher complexity.
The table below shows the data and combined complexity
for the following problem: given an instance with $n$
tuples and a tgd $\alpha \rightarrow \beta$, is the standard/oblivious
chase step applicable?



Thus, at first glance the oblivious and semi-oblivious
chase procedures appear to be the more appropriate choice when
it comes to a practical implementation. Still, as we will
show, the lower complexity comes at a price: the higher
the complexity of a chase variation, the more
"likely" the chase process is to terminate for a given
instance and set of dependencies. Thus, the core chase,
which not only applies all standard chase steps in parallel
but also computes the core of the resulting instance,
has the highest complexity of the chase step. On the
other hand, from [6] we know that the core chase is complete
in finding universal models, meaning that if any of
the chase variations terminates for some input, then
the core chase terminates as well. We next compare the
semi-oblivious and standard chase when it comes to the
termination problem. We show that the standard
and semi-oblivious chase are not distinguishable
for most classes of dependencies developed to ensure
termination of the standard chase. Furthermore, we show
that the number of semi-oblivious chase steps needed to
terminate remains the same as for the standard chase,
namely polynomial. This raises the following question:
What makes a class of dependency sets that terminate
for all input instances under the standard chase
terminate for the semi-oblivious chase as well? We answer
this question by giving a sufficient syntactical condition
for classes of dependency sets that ensure termination
on all instances for the standard chase to also guarantee
termination for the semi-oblivious chase. As we will see,
most of the known classes of dependencies built to ensure
termination of the standard chase on all instances are
actually guaranteeing termination for the less expensive
semi-oblivious chase variation.

It has been known for some time [6] [4] [17] that it is
undecidable to determine if the chase with a given set of
tgd's terminates on a given instance. This has spurred a
quest for restricted classes of tgd's guaranteeing termination.
Interestingly, these classes all guarantee uniform
termination, that is, termination on all instances.
This is so even though it was only known that the problem
is undecidable for a given instance. We remedy
the situation by proving that (perhaps not too surprisingly)
the uniform version of the termination problem is
undecidable as well, and show that it is not recursively
enumerable. We show that determining whether the
core chase with a given set of dependencies terminates
on all instances is coRE-complete. We achieve this using
a reduction from the uniform termination problem
for word-rewriting systems (semi-Thue systems). As a
byproduct we obtain the result from [6] showing that
testing if the core chase terminates for a given instance
and a given set of dependencies is RE-complete. We
will show also that the same complexity result holds 
for testing whether the standard chase with a set of 
dependencies is terminating on at least one execution 
branch. Next we will show that by using a single denial 
constraint (a "headless" tgd) in our reduction the same 
complexity result holds also for the standard chase ter- 
mination on all instances on all execution branches. It 
remains an open problem if this holds without denial 
constraints. 

Many of the restricted classes guaranteeing termination
rely on the notion of a set $\Sigma$ of dependencies being
stratified. Stratification involves two conditions, one
determining a partial order between tgd's in $\Sigma$, and the
other on $\Sigma$ as a whole. It has been claimed that testing
the partial order between tgd's is in NP [6]. We
show that this cannot be the case (unless NP=coNP),
by proving that the problem is at least coNP-hard. We
also prove a $\Delta_2^p$ upper bound for the problem. Finding
matching upper and lower bounds remains an open
problem.

Paper outline 

The next section contains the preliminaries and describes
the chase procedure and its variations. Section 3 deals
with the complexity of testing if for an instance and 
a dependency there exists an "applicable" chase step. 
Section 4 deals with problems related to the chase ter- 
mination. We define termination classes for each of 
the chase variations and then determine the relation- 
ship between these classes. Section 4 also contains our 
main result, namely, that it is coRE-complete to test 
if the chase variations with a given set of dependen- 
cies terminate on all instances. This result is obtained 
via a reduction from the uniform termination problem
for word-rewriting systems. In Section 5 we review the
main restricted classes that ensure termination on all
instances, and relate them to different chase variations 
and their termination classes. Finally, in Section 6 we 
provide complexity results related to the membership 
problem for the stratification based classes of dependen- 
cies that ensure the standard chase termination. Con- 
clusions and further work are drawn in the last section. 
Proofs not given in the paper are included in an Ap- 
pendix. 



2. PRELIMINARIES 

For basic definitions and concepts we refer to [1]. We
will consider the complexity classes PTIME, NP, coNP,
DP, RE, coRE, and the first few levels of the polynomial
hierarchy. For the definitions of these classes we refer
to

We start with some preliminary notions. We will use
the symbol $\subseteq$ for the subset relation, and $\subset$ for proper
subset. A function $f$ with a finite set $\{x_1, \ldots, x_n\}$ as
domain, and where $f(x_i) = a_i$, will be described as
$\{x_1/a_1, \ldots, x_n/a_n\}$. The reader is cautioned that the
symbol $\rightarrow$ will be overloaded; the meaning should
however be clear from the context.

Relational schemas and instances. A relational
schema is a finite set $\mathbf{R} = \{R_1, \ldots, R_n\}$ of relational
symbols $R_i$, each with an associated positive integer
$arity(R_i)$. Let Cons be a countably infinite set of constants,
usually denoted $a, b, c, \ldots$, possibly subscripted,
and let Nulls be a countably infinite set of nulls denoted
$x, y_1, y_2, \ldots$. A relational instance $I$ over a schema $\mathbf{R}$ is
a function that associates with each relational symbol
$R \in \mathbf{R}$ a finite subset $R^I$ of $(\mathsf{Cons} \cup \mathsf{Nulls})^{arity(R)}$.

For notational convenience we shall frequently identify
an instance $I$ with the set $\{R(\bar{a}) : (\bar{a}) \in R^I, R \in \mathbf{R}\}$
of atoms, assuming appropriate lengths of the sequence
$\bar{a}$ for each $R \in \mathbf{R}$. By the same convenience, the atoms
$R(a_1, \ldots, a_k)$ will be called tuples of relation $R^I$ and
denoted $t, t_1, t_2, \ldots$. By $dom(I)$ we mean the set of all
constants and nulls occurring in the instance $I$, and by
$|I|$ we mean the number of tuples in $I$.

Homomorphisms. Let $I$ and $J$ be instances, and $h$
a mapping from $dom(I)$ to $dom(J)$ that is the identity
on the constants. We extend $h$ to tuples $(\bar{a}) = (a_1, \ldots, a_k)$
by $h(a_1, \ldots, a_k) = (h(a_1), \ldots, h(a_k))$. By
our notational convenience we can thus write $h(\bar{a})$ as
$h(R(\bar{a}))$, when $(\bar{a}) \in R^I$. We extend $h$ to instances by
$h(I) = \{h(t) : t \in I\}$. If $h(I) \subseteq J$ we say that $h$ is a
homomorphism from $I$ to $J$. If $h(I) \subseteq I$, we say that $h$
is an endomorphism, and if $h$ also is idempotent, $h$ is
called a retraction. If $h(I) \subseteq J$, the mapping $h$ is a
bijection, and $h^{-1}(J) = I$, the two instances are
isomorphic.
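
As an illustration of the definition, the following Python sketch enumerates the homomorphisms between two small instances by brute force. The encoding is an assumption made only for this example: an atom is a tuple such as ("R", "a", "_x"), and a value is treated as a null exactly when it starts with an underscore. The function name `homomorphisms` is likewise hypothetical.

```python
from itertools import product


def dom(inst):
    """All constants and nulls occurring in the instance."""
    return {v for atom in inst for v in atom[1:]}


def homomorphisms(I, J):
    """Yield each mapping h on the nulls of I (constants are fixed) with h(I) a subset of J."""
    nulls = sorted(v for v in dom(I) if v.startswith("_"))
    targets = sorted(dom(J))
    for image in product(targets, repeat=len(nulls)):
        h = dict(zip(nulls, image))
        mapped = {(a[0],) + tuple(h.get(v, v) for v in a[1:]) for a in I}
        if mapped.issubset(J):
            yield h


I = {("R", "a", "_x")}
J = {("R", "a", "b"), ("R", "a", "c")}
found = list(homomorphisms(I, J))  # the null _x can go to b or to c
```

The search tries every assignment of the nulls, so it is exponential in the number of nulls and is meant only to make the definition concrete.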

A subset $I'$ of $I$ is said to be a core of $I$, if there is
an endomorphism $h$, such that $h(I) \subseteq I'$, and there is no
endomorphism $g$ such that $g(I') \subset I'$. It is well known
that all cores of an instance $I$ are isomorphic, so for our
purposes we can consider the core unique, and denote
it $core(I)$.
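
A core can be computed naively by shrinking the instance with proper endomorphisms until none is left. The sketch below does this by exhaustive search; it assumes the same illustrative encoding as before (atoms as tuples, nulls prefixed with an underscore) and the function name `core` is a hypothetical helper, not an algorithm taken from the literature cited here.

```python
from itertools import product


def core(inst):
    """Repeatedly replace inst by h(inst) for an endomorphism h with h(inst) a proper subset."""
    inst = set(inst)
    while True:
        values = sorted({v for atom in inst for v in atom[1:]})
        nulls = [v for v in values if v.startswith("_")]
        for image in product(values, repeat=len(nulls)):
            h = dict(zip(nulls, image))
            mapped = {(a[0],) + tuple(h.get(v, v) for v in a[1:]) for a in inst}
            if mapped.issubset(inst) and mapped != inst:
                inst = mapped  # strictly smaller image: continue from it
                break
        else:
            return inst  # no proper endomorphism exists, so inst is a core


example = {("S", "a", "d"), ("S", "a", "_z1"), ("R", "a", "b")}
reduced = core(example)  # the null _z1 is folded onto the constant d
```

Each round strictly shrinks the instance, so the loop stops after at most $|I|$ rounds, although each round may take exponential time.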

Tuple generating dependencies. A tuple generating
dependency (tgd) is a first order formula of the form

$\forall \bar{x}, \bar{y}\, (\alpha(\bar{x}, \bar{y}) \rightarrow \exists \bar{z}\, \beta(\bar{x}, \bar{z}))$,

where $\alpha$ and $\beta$ are conjunctions of relational atoms, and
$\bar{x}$, $\bar{y}$ and $\bar{z}$ are sequences of variables. We assume that
the variables occurring in tgd's come from a countably
infinite set Vars disjoint from Nulls. We also allow constants
in the tgd's. In the formula we call $\alpha$ the body of
the tgd. Similarly we refer to $\beta$ as the head of the tgd.
If there are no existentially quantified variables the
dependency is said to be full.

When $\alpha$ is the body of a tgd and $h$ a mapping from
the set Vars $\cup$ Cons to Nulls $\cup$ Cons that is the identity on
constants, we shall conveniently regard the set of atoms
in $\alpha$ as an instance $I_\alpha$, and write $h(\alpha)$ for the set $h(I_\alpha)$.
Then $h$ is a homomorphism from $\alpha$ to an instance $I$, if
$h(\alpha) \subseteq I$.

Frequently, we omit the universal quantifiers in tgd
formulas. Also, when the variables and constants are
not relevant in the context, we denote a tuple generating
dependency $\alpha(\bar{x}, \bar{y}) \rightarrow \exists \bar{z}\, \beta(\bar{x}, \bar{z})$ simply as $\alpha \rightarrow \beta$.

Let $\xi = \alpha \rightarrow \beta$ be a tuple generating dependency, and
$I$ be an instance. Then we say that $I$ satisfies $\xi$, if $I \models \xi$
in the standard model theoretic sense, or equivalently,
if for every homomorphism $h$, such that $h(\alpha) \subseteq I$, there
is an extension $h'$ of $h$, such that $h'(\beta) \subseteq I$.

The Chase. Let $\Sigma$ be a (finite) set of tgd's and $I$ an
instance. A trigger for the set $\Sigma$ on $I$ is a pair $(\xi, h)$,
where $\xi = \alpha \rightarrow \beta \in \Sigma$, and $h$ is a homomorphism such
that $h(\alpha) \subseteq I$. If, in addition, there is no extension $h'$
of $h$, such that $h'(\beta) \subseteq I$, we say that the trigger $(\xi, h)$
is active on $I$.
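
The two notions can be made concrete with a small backtracking matcher. The sketch below is only illustrative and rests on assumptions of our own: a tgd is a pair (body, head) of atom lists, variables are strings starting with "?", and the helper names `matches`, `triggers` and `is_active` are hypothetical.

```python
def matches(atoms, inst, h):
    """Yield every extension of h that maps all atoms into inst."""
    if not atoms:
        yield dict(h)
        return
    first, rest = atoms[0], atoms[1:]
    for fact in inst:
        if fact[0] != first[0] or len(fact) != len(first):
            continue
        g = dict(h)
        if all((g.setdefault(t, v) == v) if t.startswith("?") else (t == v)
               for t, v in zip(first[1:], fact[1:])):
            yield from matches(rest, inst, g)


def triggers(tgd, inst):
    """All homomorphisms h from the body of the tgd into inst."""
    body, _head = tgd
    return list(matches(body, inst, {}))


def is_active(tgd, h, inst):
    """A trigger is active when h has no extension mapping the head into inst."""
    _body, head = tgd
    return next(matches(head, inst, h), None) is None


tgd = ([("R", "?x", "?y")], [("S", "?x", "?z")])
inst = {("R", "a", "b"), ("R", "a", "c"), ("S", "a", "d")}
all_triggers = triggers(tgd, inst)
active = [h for h in all_triggers if is_active(tgd, h, inst)]
```

Here `all_triggers` contains two homomorphisms and `active` is empty, because S(a, d) already witnesses the head for both of them.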

Let $(\xi, h)$ be a trigger for $\Sigma$ on $I$. To fire the trigger
means transforming $I$ into the instance $J = I \cup \{h'(\beta)\}$,
where $h'$ is a distinct extension of $h$, i.e. an extension
of $h$ that assigns new fresh nulls to the existential variables
in $\beta$. By "new fresh" we mean the next unused
element in some fixed enumeration of the nulls. We
denote this transformation $I \xrightarrow{(\xi, h)} J$, or just $I \rightarrow J$, if the
particular trigger is irrelevant or understood from the
context.

A sequence $I_0, I_1, I_2, \ldots$ of instances (finite or infinite)
is said to be a chase sequence with $\Sigma$ originating from
$I_0$, if $I_i \rightarrow I_{i+1}$ for all $i = 0, 1, 2, \ldots$. At each step there
can naturally be several triggers to choose from, so in
general there will be several chase sequences originating
from $I_0$ for any given set $\Sigma$ of tgd's. If for some $i$
there are no more triggers to be fired for $I_i$, we say
that the sequence terminates. Otherwise the sequence
is infinite.

In summary, the chase process can be seen as a tree
rooted at $I_0$, and with the individual chase sequences
as branches. From an algorithmic point of view the
choice of the next trigger to fire is essential. Based on
this, the following variations of the chase process have
been considered in the literature (for a comprehensive
review of different chase variations see [22]).

1. The standard chase [9]. The next trigger is chosen
nondeterministically from the subset of current
triggers that are active.

2. The oblivious chase [4]. The next trigger is chosen
nondeterministically from the set of all current
triggers, active or not, but each trigger is fired only
once in a chase sequence.

3. The semi-oblivious chase [17]. Let $\xi$ be a tgd
$\alpha(\bar{x}, \bar{y}) \rightarrow \exists \bar{z}\, \beta(\bar{x}, \bar{z})$. Then triggers $(\xi, h)$ and $(\xi, g)$
are considered equivalent if $h(\bar{x}) = g(\bar{x})$. The
semi-oblivious chase works as the oblivious one,
except that exactly one trigger from each equivalence
class is fired in a branch.

4. The core chase [6]. At each step, all currently ac- 
tive triggers are fired in parallel, and then the core 
of the union of the resulting instances is computed 
before the next step. Note that this makes the 
chase process deterministic. 

The three first variations are all nondeterministic, but 
differ in which triggers they fire. Also we consider all 
chase procedures to be fair, meaning that in any infinite
chase sequence, if a trigger is applicable at some chase 
step i, then there exists an integer j > i such that the 
trigger is fired at step j. 

To illustrate the difference between these chase variations,
consider dependency set $\Sigma = \{\xi\}$, where $\xi$ is the tgd
$R(x, y) \rightarrow \exists z\, S(x, z)$, and instance:

$I_0$
R(a, b)
R(a, c)
S(a, d)

There are two triggers for the set $\Sigma$ on instance $I_0$,
namely $(\xi, \{x/a, y/b\})$ and $(\xi, \{x/a, y/c\})$. Since $I_0 \models \xi$,
neither of the triggers is active, so the standard chase
will terminate at $I_0$. In contrast, both the oblivious and
semi-oblivious chase will fire the first trigger, resulting
in instance $I_1 = I_0 \cup \{S(a, z_1)\}$. The semi-oblivious chase
will terminate at this point, while the oblivious chase
will fire the second trigger, and then terminate in
$I_2 = I_1 \cup \{S(a, z_2)\}$. The core chase in this case will terminate
also with $I_0$.
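
The example can be replayed with a small simulator. This is a sketch under assumptions of our own, not the procedure of any cited system: atoms are tuples, tgd variables start with "?", fresh nulls are named _z1, _z2, ..., and the hypothetical function `chase` follows a single branch by always firing the first usable trigger in a fixed order (so it does not model fairness or the core chase).

```python
from itertools import count


def matches(atoms, inst, h):
    """Yield every extension of h that maps all atoms into inst."""
    if not atoms:
        yield dict(h)
        return
    first, rest = atoms[0], atoms[1:]
    for fact in inst:
        if fact[0] != first[0] or len(fact) != len(first):
            continue
        g = dict(h)
        if all((g.setdefault(t, v) == v) if t.startswith("?") else (t == v)
               for t, v in zip(first[1:], fact[1:])):
            yield from matches(rest, inst, g)


def variables(atoms):
    return {t for atom in atoms for t in atom[1:] if t.startswith("?")}


def chase(tgds, inst, variant, max_steps=50):
    """Follow one branch of the 'std', 'obl' or 'sobl' chase."""
    inst, fired, fresh = set(inst), set(), count(1)
    for _ in range(max_steps):
        choice = None
        for i, (body, head) in enumerate(tgds):
            frontier = sorted(variables(body) & variables(head))
            found = sorted(matches(body, inst, {}), key=lambda m: sorted(m.items()))
            for h in found:
                if variant == "std":  # only active triggers may fire
                    key, usable = None, next(matches(head, inst, h), None) is None
                else:  # fire once per trigger (obl) or per frontier image (sobl)
                    keep = frontier if variant == "sobl" else sorted(h)
                    key = (i, tuple(h[v] for v in keep))
                    usable = key not in fired
                if usable and choice is None:
                    choice = (head, h, key)
        if choice is None:
            return inst
        head, h, key = choice
        fired.add(key)
        for v in sorted(variables(head) - set(h)):
            h[v] = f"_z{next(fresh)}"  # fresh null for each existential variable
        inst |= {(a[0],) + tuple(h.get(t, t) for t in a[1:]) for a in head}
    raise RuntimeError("no termination within max_steps")


xi = ([("R", "?x", "?y")], [("S", "?x", "?z")])
I0 = {("R", "a", "b"), ("R", "a", "c"), ("S", "a", "d")}
results = {v: chase([xi], I0, v) for v in ("std", "sobl", "obl")}
```

Under these assumptions the standard variant returns $I_0$ unchanged, the semi-oblivious variant adds one S-tuple, and the oblivious variant adds two, matching the discussion above.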



3. COMPLEXITY OF THE CHASE STEP 

Algorithmically, there are two problems to consider.
For knowing when to terminate the chase, we need to
determine whether for a given instance $I$ and tgd $\xi$ there
exists a homomorphism $h$ such that $(\xi, h)$ is a trigger on
$I$. This pertains to the oblivious and semi-oblivious
variations. The second problem pertains to the standard
and core chase: given an instance $I$ and a tgd $\xi$, is
there a homomorphism $h$, such that $(\xi, h)$ is an active
trigger on $I$? We call these problems the trigger existence
problem, and the active trigger existence problem,
respectively. The data complexity of these problems
considers $\xi$ fixed, and in the combined complexity
both $I$ and $\xi$ are part of the input. The following theorem
gives the combined and data complexities of the
two problems.

Theorem 1. Let $\xi$ be a tgd and $I$ an instance. Then

1. For a fixed $\xi$, testing whether there exists a trigger
or an active trigger on a given $I$ is polynomial.

2. Testing whether there exists a trigger for a given $\xi$
on a given $I$ is NP-complete.

3. Testing whether there exists an active trigger for a
given $\xi$ and a given $I$ is $\Sigma_2^p$-complete.

Proof: (Sketch) The polynomial cases can be verified
by checking all homomorphisms from the body of
the dependency into the instance. For the active trigger
problem we also need to consider, for each such homomorphism,
if it has an extension that maps the head of
the dependency into the instance. These tasks can be
carried out in $O(n^{|\alpha|})$ and $O(n^{|\alpha|+|\beta|})$ time, respectively.

It is easy to see that the trigger existence problem is
NP-complete in combined complexity, as the problem is
equivalent to testing whether there exists a homomorphism
between two instances (in our case $\alpha$ and $I$); a
problem known to be NP-complete.

For the combined complexity of the active trigger existence
problem, we observe that it is in $\Sigma_2^p$, since one
may guess a homomorphism $h$ from $\alpha$ into $I$, and then
use an NP oracle to verify that there is no extension $h'$
of $h$, such that $h'(\beta) \subseteq I$. For the lower bound we will
reduce the following problem to the active trigger existence
problem [24]. Let $\phi(\bar{x}, \bar{y})$ be a Boolean formula
in 3CNF over the variables in $\bar{x}$ and $\bar{y}$. Is the formula

$\exists \bar{x}\, \neg(\exists \bar{y}\, \phi(\bar{x}, \bar{y}))$

true? The problem is a variation of the standard $\exists\forall$-QBF
problem [25].

For the reduction, let $\phi$ be given. We construct an
instance $I_\phi$ and a tgd $\xi_\phi$. The instance is as follows:

F          N
0 0 1      0 1
0 1 0      1 0
0 1 1
1 0 0
1 0 1
1 1 0
1 1 1

The tgd $\xi_\phi = \alpha \rightarrow \beta$ is constructed as follows. For each
variable $x \in \bar{x}$ in $\phi(\bar{x}, \bar{y})$, the body $\alpha$ will contain the
atom $N(x, x')$ ($x'$ is used to represent $\neg x$). The head
$\beta$ is existentially quantified over the set $\bigcup_{y \in \bar{y}} \{y, y'\}$ of
variables. For each conjunct $C$ of $\phi$, we place in $\beta$ an
atom $F(x, y, z)$, where $x$, $y$ and $z$ are the variables in
$C$, with the convention that if variable $x$ is negated in
$C$, then $x'$ is used in the atom. Finally for each $y \in \bar{y}$,
we place in $\beta$ the atom $N(y, y')$, denoting that $y$ and
$y'$ should not have the same truth assignment.

Let us now suppose that the formula $\exists \bar{x}\, \neg(\exists \bar{y}\, \phi(\bar{x}, \bar{y}))$
is true. This means that there is a $\{0, 1\}$-valuation
$h$ of $\bar{x}$ such that for any $\{0, 1\}$-valuation $h'$ of $\bar{y}$, the
formula $\phi(h(\bar{x}), h'(\bar{y}))$ is false. It is easy to see that
$h(\alpha) \subseteq I_\phi$. Also, since $\phi(h(\bar{x}), h'(\bar{y}))$ is false for any
valuation $h'$, for each $h'$ there must be an atom $F(x, y, z) \in \beta$,
such that $h' \circ h(F(x, y, z))$ is false, that is, either
$h' \circ h(F(x, y, z)) = F(0, 0, 0)$, or $h'$ assigns non-Boolean
values to some existentially quantified variables.
Consequently the trigger is active on $I_\phi$.

For the other direction, suppose that there exists a
trigger $(\xi_\phi, h)$ which is active on $I_\phi$, i.e., $h(\alpha) \subseteq I_\phi$ and
$h'(\beta) \not\subseteq I_\phi$, for any extension $h'$ of $h$. This means that for
any such extension $h'$, either $h'$ is not a $\{0, 1\}$-valuation,
or the atom $F(0, 0, 0)$ is in $h'(\beta)$. Thus the formula
$\exists \bar{x}\, \neg(\exists \bar{y}\, \phi(\bar{x}, \bar{y}))$ is true. ■

Note that the trigger existence relates to the oblivious 
and semi-oblivious chase variations, whereas the active 
trigger existence relates to the standard and the core 
chase. This means that the oblivious and semi-oblivious 
chase variations have the same complexity. This is not 
the case for the standard and the core chase, because 
the core chase step applies all active triggers in parallel 
and also involves the core computation for the resulting
instance. In [10] it is shown that computing the core
involves a DP-complete decision problem.



4. CHASE TERMINATION QUESTIONS

Being able to decide whether a chase should be terminated
at a given step in the sequence does not mean
that we can decide whether the chase ever will terminate.
The latter problem is undecidable in general, as we will
see in the second subsection. However, the several chase
variations have different termination behavior, so next
we introduce some notions that will help to distinguish
them.

4.1 Termination classes

Let $* \in \{std, obl, sobl, core\}$, corresponding to the chase
variations introduced in Section 2, and let $\Sigma$ be a set
of tgd's. If there exists a terminating $*$-chase sequence
with $\Sigma$ on $I$, we say that the $*$-chase terminates for some
branch on instance $I$, and denote this as $\Sigma \in CT^{*}_{I,\exists}$.
Here $CT^{*}_{I,\exists}$ is thus to be understood as the class of all
sets of tgd's for which the $*$-chase terminates on some
branch on instance $I$. Likewise, $CT^{*}_{I,\forall}$ denotes the class
of all sets of tgd's for which the $*$-chase with $\Sigma$ on $I$
terminates on all branches. From the definition of the
chase variations it is easy to observe that any trigger
applicable by the standard chase step on an instance
$I$ is also applicable by the semi-oblivious and oblivious
chase steps on the same instance. Similarly all the
triggers applicable by the semi-oblivious chase step on
an instance $I$ are also applicable by the oblivious chase
step on instance $I$. Note that the converse is not always
true. Thus $CT^{obl}_{I,\forall} \subseteq CT^{sobl}_{I,\forall} \subseteq CT^{std}_{I,\forall}$. It is also
easy to verify that $CT^{*}_{I,\forall} = CT^{*}_{I,\exists}$ for $* \in \{obl, sobl\}$,
that $CT^{std}_{I,\forall} \subseteq CT^{std}_{I,\exists}$, and that $CT^{sobl}_{I,\forall} \subseteq CT^{std}_{I,\forall}$. The
following proposition shows that these results can be
strengthened by including strict inclusions:

Proposition 1. For any instance $I$ we have:

$CT^{obl}_{I,\forall} = CT^{obl}_{I,\exists} \subset CT^{sobl}_{I,\forall} = CT^{sobl}_{I,\exists} \subset CT^{std}_{I,\forall} \subset CT^{std}_{I,\exists}$.

Proof: (Sketch) For the strict inclusion $CT^{obl}_{I,\exists} \subset CT^{sobl}_{I,\forall}$
consider instance $I = \{R(a)\}$ and $\Sigma$ containing a single
(tautological) dependency $R(x) \rightarrow \exists y\, R(y)$. It is
easy to see that $\Sigma \in CT^{sobl}_{I,\forall}$ but $\Sigma \notin CT^{obl}_{I,\exists}$. For the second
strict inclusion $CT^{sobl}_{I,\exists} \subset CT^{std}_{I,\forall}$, consider instance
$I = \{S(a, a)\}$ and $\Sigma = \{S(x, y) \rightarrow \exists z\, S(y, z)\}$. Because
$I \models \Sigma$ it follows that $\Sigma \in CT^{std}_{I,\forall}$. On the other
hand, the semi-oblivious chase will not terminate with
$\Sigma$ on $I$, and thus $\Sigma \notin CT^{sobl}_{I,\exists}$. For the final strict
inclusion $CT^{std}_{I,\forall} \subset CT^{std}_{I,\exists}$, let $I = \{S(a, b), R(b)\}$ and
$\Sigma = \{S(x, y) \rightarrow \exists z\, S(y, z);\ R(x) \rightarrow S(x, x)\}$. It is easy
to see that any standard chase sequence that starts by
firing the trigger based on the first tgd will not terminate,
whereas if we first fire the trigger based on the
second tgd, the standard chase will terminate after one
step. ■



The next question is whether a $*$-chase terminates
on all instances for all or for some branches. The corresponding
classes of sets of tgd's are denoted $CT^{*}_{\forall\forall}$
and $CT^{*}_{\forall\exists}$, respectively. Obviously $CT^{std}_{\forall\forall} \subseteq CT^{std}_{\forall\exists}$.
Similarly to the instance dependent termination classes,
$CT^{obl}_{\forall\forall} \subset CT^{sobl}_{\forall\forall}$ and $CT^{*}_{\forall\forall} = CT^{*}_{\forall\exists}$, for $* \in \{obl, sobl\}$.

We can now relate the oblivious, semi-oblivious and standard
chase termination classes as follows:

Theorem 2.

$CT^{obl}_{\forall\forall} = CT^{obl}_{\forall\exists} \subset CT^{sobl}_{\forall\forall} = CT^{sobl}_{\forall\exists} \subset CT^{std}_{\forall\forall} \subset CT^{std}_{\forall\exists}$.

The proof of this theorem is included in the Appendix.

Note that for any $* \in \{std, obl, sobl, core\}$, and for any
non-empty instance $I$, we have that $CT^{*}_{\forall\forall} \subset CT^{*}_{I,\forall}$ and
$CT^{*}_{\forall\exists} \subset CT^{*}_{I,\exists}$.

The termination of the oblivious chase can be related
to the termination of the standard chase by using the
enrichment transformation, introduced in [11]. The enrichment
takes a tgd $\xi = \alpha(\bar{x}, \bar{y}) \rightarrow \exists \bar{z}\, \beta(\bar{x}, \bar{z})$ over
schema $\mathbf{R}$ and converts it into the tuple generating dependency
$\hat{\xi} = \alpha(\bar{x}, \bar{y}) \rightarrow \exists \bar{z}\, \beta(\bar{x}, \bar{z}), H(\bar{x}, \bar{y})$, where $H$ is a
new relational symbol that does not appear in $\mathbf{R}$. For
a set $\Sigma$ of tgd's the transformed set is $\hat{\Sigma} = \{\hat{\xi} : \xi \in \Sigma\}$.
Using the enrichment notion the following was shown.

Theorem 3. [11] $\Sigma \in CT^{obl}_{\forall\forall}$ if and only if $\hat{\Sigma} \in CT^{std}_{\forall\forall}$.

To relate the termination of the semi-oblivious chase
to the standard chase termination, we use a transformation
similar to the enrichment. This transformation is
called semi-enrichment and takes a tuple generating dependency
$\xi = \alpha(\bar{x}, \bar{y}) \rightarrow \exists \bar{z}\, \beta(\bar{x}, \bar{z})$ over a schema $\mathbf{R}$ and
converts it into the tgd $\tilde{\xi} = \alpha(\bar{x}, \bar{y}) \rightarrow \exists \bar{z}\, \beta(\bar{x}, \bar{z}), H(\bar{x})$,
where $H$ is a new relational symbol which does not appear
in $\mathbf{R}$. For a set $\Sigma$ of tgd's defined on schema $\mathbf{R}$,
the transformed set is $\tilde{\Sigma} = \{\tilde{\xi} : \xi \in \Sigma\}$. Using the
semi-enrichment notion, the standard and the semi-oblivious
chase can be related as follows.

Theorem 4. $\Sigma \in CT^{sobl}_{\forall\forall}$ if and only if $\tilde{\Sigma} \in CT^{std}_{\forall\forall}$.

Proof. Similar to the proof of Theorem 3. ■

We now turn our attention to the core chase. Note that
the core chase is deterministic since all active triggers
are fired in parallel, before taking the core of the result.
Thus we have:

Proposition 2. $CT^{core}_{I,\forall} = CT^{core}_{I,\exists}$ and $CT^{core}_{\forall\forall} = CT^{core}_{\forall\exists}$.

It is well known that all chase variations considered here
compute a (finite) universal model of $I$ and $\Sigma$
when they terminate [9] [6] m [17]. In [6], Deutsch et
al. showed that if $I \cup \Sigma$ has a universal model, the core
chase will terminate in an instance that is the core of
all universal models. We thus have

Proposition 3.

1. $CT^{std}_{I,\exists} \subset CT^{core}_{I,\forall}$, for any instance $I$.

2. $CT^{std}_{\forall\exists} \subset CT^{core}_{\forall\forall}$.

Proof: (Sketch) To see that the inclusion in part 2 of
the proposition is strict, let $\Sigma = \{R(x) \rightarrow \exists z\, R(z), S(x)\}$,
and $I_0 = \{R(a)\}$. In this setting there will be exactly
one active trigger at each step, and the algorithm will
converge only at the infinite instance

$\bigcup_{i > 1} \{R(z_i), S(z_{i-1})\} \cup \{R(a), S(a), R(z_1)\}$.

From this, it follows that $\Sigma \notin CT^{std}_{\forall\forall}$ and $\Sigma \notin CT^{std}_{\forall\exists}$. Note
that for any positive integer $i$, the core of the instance
$I_i$ is $\{R(a), S(a)\}$. Thus the core chase will terminate
at instance $I_1 = \{R(a), S(a)\}$. ■

The following Corollary highlights the relationship
between the termination classes.

Corollary 1. $CT^{obl}_{\forall\forall} \subset CT^{sobl}_{\forall\forall} \subset CT^{std}_{\forall\forall} \subset CT^{std}_{\forall\exists} \subset CT^{core}_{\forall\forall}$.
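
The enrichment and semi-enrichment rewritings of this subsection are purely syntactic, which the following Python sketch tries to make concrete. It assumes an illustrative encoding (a tgd is a pair of atom lists, variables start with "?") and takes the name of the fresh relation symbol as a parameter `aux`; choosing a symbol that is unused in the schema is left to the caller. The function names are hypothetical.

```python
def variables(atoms):
    """Variables of the atoms in order of first occurrence, without duplicates."""
    seen = dict.fromkeys(t for atom in atoms for t in atom[1:] if t.startswith("?"))
    return list(seen)


def enrich(tgd, aux="H"):
    """Add the atom H(x, y) over all body variables to the head."""
    body, head = tgd
    return body, head + [(aux, *variables(body))]


def semi_enrich(tgd, aux="H"):
    """Add the atom H(x) over the body variables that also occur in the head."""
    body, head = tgd
    head_vars = set(variables(head))
    shared = [v for v in variables(body) if v in head_vars]
    return body, head + [(aux, *shared)]


xi = ([("R", "?x", "?y")], [("S", "?x", "?z")])
enriched = enrich(xi)            # head gains H(?x, ?y)
semi_enriched = semi_enrich(xi)  # head gains H(?x)
```

Intuitively, the extra atom records which trigger (or which equivalence class of triggers) has already been fired, so that the standard chase on the rewritten set mimics the oblivious (respectively semi-oblivious) chase on the original set.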

4.2 Undecidability of termination 

It has been known for some time that "chase termination
is undecidable." Specifically, the following results
have been obtained in the literature so far.

Theorem 5.

1. $CT^{std}_{I,\forall}$ and $CT^{std}_{I,\exists}$ are RE-complete.

2. $CT^{core}_{I,\forall} = CT^{core}_{I,\exists}$, and both sets are RE-complete [6].

3. $CT^{sobl}_{I,\forall} = CT^{sobl}_{I,\exists}$, and both sets are RE-complete [17].

4. Let $\Sigma$ be a set of guarded tgd's [4]. Then the question
$\Sigma \in CT^{core}_{I,\forall}$ is decidable [13].



Our aim is to provide a systematic overview on the
complexity of all termination classes. We shall first
show that $CT^{core}_{\forall\forall}$ is coRE-complete. To achieve this,
we provide a uniform reduction for both the $CT^{core}_{I,\forall}$ and
$CT^{core}_{\forall\forall}$ problems, thus reproving Theorem 5 part 2 as a
side effect. We also note that the proofs of these results rely on
machine reductions, and that Marnette observes in [18]
that a proof using a higher level problem, such as Post's
correspondence problem, is still lacking. We fill this
gap by reducing from word-rewriting systems. A word-rewriting
system, also known as a semi-Thue system, is
a set of rules of the form $\ell \rightarrow r$, where $\ell$ and $r$ are words
over a finite alphabet $A$. Let $u$ and $v$ be words in $A^*$.
Then $v$ can be derived from $u$ if $u = x \ell y$ and $v = x r y$, for
some $x, y \in A^*$ and rule $\ell \rightarrow r$. A word-rewriting system
is terminating for a word $w$ if the derivation closure of
$w$ is finite. The system is uniformly terminating if it is
terminating for all words $w \in A^*$. We will prove that
for every word-rewriting system and word there is a set
of dependencies and an instance such that the rewriting
system is terminating for that word if and only if
the core chase with the corresponding dependencies on
the corresponding instance terminates. We also prove
that if the core chase with the corresponding dependencies
is infinite on all instances, then the rewriting system
is uniformly terminating. It has long been known
that testing if a word-rewriting system terminates for
a given input word is RE-complete [s], and that testing
if a word-rewriting system is uniformly terminating is
coRE-complete [15].
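
To make the notion of a derivation closure concrete, here is a small Python sketch. It assumes the usual reading in which a rule $\ell \rightarrow r$ rewrites a word $x \ell y$ into $x r y$, represents rules as pairs of strings, and uses hypothetical function names. Since termination is undecidable, the search is cut off at an arbitrary `limit`; a result of None only means that the bound was exceeded, not that the closure is infinite.

```python
def step(word, rules):
    """All words obtained from `word` by applying one rule at one position."""
    out = set()
    for lhs, rhs in rules:
        start = word.find(lhs)
        while start != -1:
            out.add(word[:start] + rhs + word[start + len(lhs):])
            start = word.find(lhs, start + 1)
    return out


def closure(word, rules, limit=1000):
    """Derivation closure of `word`, or None once more than `limit` words are reached."""
    seen, frontier = {word}, [word]
    while frontier:
        nxt = []
        for w in frontier:
            for u in step(w, rules) - seen:
                seen.add(u)
                nxt.append(u)
        if len(seen) > limit:
            return None
        frontier = nxt
    return seen


swap = [("ab", "ba")]          # moves every b to the left of every a
finite = closure("aab", swap)  # reaches a fixpoint after two rewrites
grow = closure("a", [("a", "aa")], limit=50)  # keeps growing, cut off by the bound
```

The first system has a finite closure for the word aab, while the second produces ever longer words, so the bounded search gives up.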



The next theorem states the main result of this section.

Theorem 6. The membership problem for $CT^{core}_{\forall\forall}$ is
coRE-complete.

The proof involves a development using a series of
lemmas, and can be found in its entirety in the Appendix.
Our reduction from word-rewriting systems
also yields

Corollary 2. The membership problem for $CT^{core}_{I,\forall}$
is RE-complete (cf. [6]).

The same reduction from word-rewriting systems gives
the following undecidability result too:

Theorem 7. The membership problem for $CT^{std}_{\forall\exists}$ is
coRE-complete.

Unfortunately the reduction used for the previous results
cannot be used to show the undecidability of the
$CT^{obl}_{\forall\forall}$, $CT^{sobl}_{\forall\forall}$ or $CT^{std}_{\forall\forall}$ classes. To overcome this we
have to allow a single denial constraint, that is, a tgd of
the form $\alpha \rightarrow \bot$, which is satisfied by an instance $I$ only
if there is no homomorphism $h$, such that $h(\alpha) \subseteq I$. It
is an open problem if the following result (or part of it)
can be obtained without such constraints.

Theorem 8. In case the set of dependencies may
contain at least one denial constraint, the membership
problems for $CT^{obl}_{\forall\forall}$, $CT^{sobl}_{\forall\forall}$ and $CT^{std}_{\forall\forall}$ are coRE-complete.



As a final observation we need to mention that the
class "TOC" of mappings defined in [17], for which
termination of the semi-oblivious chase is proved to be
RE-complete, is not the same as the class $CT^{sobl}_{\forall\forall}$. Also
there is no direct reduction from TOC to the $CT^{sobl}_{\forall\forall}$
membership problem, as the former is defined only for sets of
tgd's describing data exchange mappings, and the question
is whether the chase terminates for all instances
that are source instances for the data exchange setting.



5. GUARANTEED TERMINATION 

To overcome the undecidability of chase termination,
a flurry of restricted classes of tgd's have been proposed
in the literature. These classes have been put forth as
subsets of $CT^{std}_{\forall\forall}$, although at the time only $CT^{std}_{I\forall}$ was
known to be undecidable. In this section we review
these restricted classes with the purpose of determining
their overall structure and termination properties.

Before reviewing these classes of sets of tgd's let us
define two properties attached to such classes based on
the enrichment and semi-enrichment rewritings defined
in Subsection 4.1.

A class $\mathcal{C}$ of sets of tgd's is said to be closed under
enrichment if $\Sigma \in \mathcal{C}$ implies that the enriched set $\hat{\Sigma}$ is also in $\mathcal{C}$. Using this
notation together with Theorem 3 gives us a sufficient
condition for a class of dependencies to belong to $CT^{obl}_{\forall\forall}$:



Proposition 4. Let $\mathcal{C} \subseteq CT^{std}_{\forall\forall}$ such that $\mathcal{C}$ is closed
under enrichment. Then $\mathcal{C} \subseteq CT^{obl}_{\forall\forall}$.

Using this proposition we will reveal classes of dependencies
that ensure termination for the oblivious chase.
Similarly we define the notion of semi-enrichment closure
for classes of dependency sets. The semi-enrichment
closure property gives a sufficient condition for the
semi-oblivious chase termination.

Proposition 5. Let $\mathcal{C} \subseteq CT^{std}_{\forall\forall}$ such that $\mathcal{C}$ is closed
under semi-enrichment. Then $\mathcal{C} \subseteq CT^{sobl}_{\forall\forall}$.

As we will see next, most of the known classes that
ensure the standard chase termination are closed under
semi-enrichment, and thus those classes actually guarantee
the semi-oblivious chase termination as well. As
we saw in Section 3, the semi-oblivious chase has a lower
complexity than the standard chase.



Acyclicity based classes 

As full tgd's do not generate any new nulls during the
chase, any chase sequence with a set of full tgd's will terminate,
since there is only a finite number of tuples that
can be formed out of the elements of the domain of
the initial instance. The cause of non-termination lies
in the existentially quantified variables in the head of
the dependencies. Most restricted classes thus rely on
restricting the tgd's in a way that prevents these existential
variables from participating in any recursion.

The class of weakly acyclic sets of tgd's [9] was one of
the first restricted classes to be proposed. Consider

$\Sigma_1 = \{R(x,y) \to \exists z\, S(z,x)\}$.

Let $(R,1)$ denote the first position in $R$, and $(S,2)$
the second position in $S$, and so on. In a chase step
based on this dependency the values from position $(R,1)$
get copied into the position $(S,2)$, whereas the values
in position $(R,1)$ "cause" the generation of a new null
value in $(S,1)$. This structure can be seen in the dependency
graph of $\Sigma_1$ that has a "copy" edge from vertex
$(R,1)$ to vertex $(S,2)$, and a "generate" edge from
vertex $(R,1)$ to vertex $(S,1)$. Note that the graph does
not contain any edges from $(R,2)$ because variable $y$
does not contribute to the generated values. The chase
will terminate since there is no recursion going through
the $(S,2)$ position. By contrast, the dependency graph
of

$\Sigma_2 = \{R(x,y) \to \exists z\, R(y,z)\}$

has a generating edge from $(R,2)$ to $(R,2)$. It is the
generating self-loop at $(R,2)$ which causes the chase on,
for example, the instance $\{R(a,b)\}$ to converge only at
the infinite instance $\{R(a,b), R(b,z_1)\} \cup \{R(z_i, z_{i+1}) : i = 1,2,\ldots\}$.
The class of weakly acyclic tgd's (WA)
is defined to be those sets of tgd's whose dependency
graph doesn't have any cycles involving a generating
edge [9]. It is easy to observe that the class WA is
closed under semi-enrichment but it is not closed under
enrichment. This is because in the case of semi-enrichment
the new relational symbol $H$ considered for
each dependency contains only variables that appear
both in the body and the head of the dependency, and
the new $H$ atoms appear only in the heads of the
semi-enriched dependencies. This means that the dependency
graph for a semi-enriched set of WA tgd's will only add
edges oriented into positions associated with the new
relational symbol. The set $\Sigma = \{R(x,y) \to \exists z\, R(x,z)\}$
shows that this is not the case for enrichment, as $\Sigma \in$ WA
but $\hat{\Sigma} \notin$ WA.
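
The dependency graph construction described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a tgd is encoded as a triple (body atoms, head atoms, set of existential variables), an atom as a pair (predicate, tuple of variables), constants are not handled, and the function names `dependency_graph` and `is_weakly_acyclic` are hypothetical, not taken from [9].

```python
def dependency_graph(tgds):
    """Return (copy_edges, generating_edges) between positions (predicate, index)."""
    copy_edges, gen_edges = set(), set()
    for body, head, exist in tgds:
        head_vars = {a for _, args in head for a in args}
        for bpred, bargs in body:
            for i, x in enumerate(bargs, start=1):
                if x not in head_vars:
                    continue  # x does not contribute to the head: no edges
                for hpred, hargs in head:
                    for j, y in enumerate(hargs, start=1):
                        if y == x:
                            copy_edges.add(((bpred, i), (hpred, j)))
                        elif y in exist:
                            gen_edges.add(((bpred, i), (hpred, j)))
    return copy_edges, gen_edges


def is_weakly_acyclic(tgds):
    """True iff no cycle of the dependency graph goes through a generating edge."""
    copy_edges, gen_edges = dependency_graph(tgds)
    succ = {}
    for u, v in copy_edges | gen_edges:
        succ.setdefault(u, set()).add(v)

    def reaches(src, dst):
        seen, stack = set(), [src]
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node not in seen:
                seen.add(node)
                stack.extend(succ.get(node, ()))
        return False

    # A generating edge u -> v lies on a cycle iff v can reach u.
    return not any(reaches(v, u) for u, v in gen_edges)


# The three example sets from the text.
sigma1 = [([("R", ("x", "y"))], [("S", ("z", "x"))], {"z"})]
sigma2 = [([("R", ("x", "y"))], [("R", ("y", "z"))], {"z"})]
sigma = [([("R", ("x", "y"))], [("R", ("x", "z"))], {"z"})]
```

Under this encoding, `sigma1` and `sigma` pass the test while `sigma2` fails it because of the generating self-loop at $(R,2)$.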

The slightly smaller class of sets of tgd's with stratified
witness (SW) [7] was introduced around the same
time as WA. An intermediate class, the richly acyclic
tgd's (RA), was introduced in [mI in a different context
and it was later shown that RA $\subset CT^{obl}_{\forall\forall}$. It can be
easily verified that both classes SW and RA are closed
under enrichment. The safe dependencies (SD) [19], and
the super-weakly acyclic (sWA) [17] ones are both
generalizations of the WA class, and both are closed under
semi-enrichment.

All of these classes have been proven to have PTIME 
membership tests, and have the following properties. 

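Theorem 9. [7] [9] [19] [17] [u]

1. SW $\subset$ RA $\subset$ WA $\subset$ SD $\subset$ sWA.

2. WA $\subset CT^{std}_{\forall\forall}$; RA $\subset CT^{obl}_{\forall\forall}$, and sWA $\subset CT^{sobl}_{\forall\forall}$.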



In order to complete the picture suggested by the
previous theorem we need a few more results. Consider

$\Sigma_3 = \{R(x,y) \to \exists z\, R(x,z)\}$.

Clearly $\Sigma_3 \in$ WA. Let $I_0 = \{R(a,b)\}$, and consider an
oblivious chase sequence $I_0, I_1, I_2, \ldots$. It is easy to
see that for any $I_i$, $i > 0$, there exists a (non-active) trigger
$(\xi, \{x/a, y/z_i\})$, meaning that the oblivious chase
will not terminate. Thus we have $\Sigma_3 \notin CT^{obl}_{\forall\forall}$. On the
other hand, for the set

$\Sigma_4 = \{S(y), R(x,y) \to \exists z\, R(y,z)\}$,

we have $\Sigma_4 \in CT^{obl}_{\forall\forall}$. Furthermore, $\Sigma_4 \notin$ WA, since the
dependency graph of $\Sigma_4$ will have a generating self-loop
on vertex $(R,2)$. This gives us

Proposition 6. The classes WA and $CT^{obl}_{\forall\forall}$ are
incomparable wrt inclusion.
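
The two examples behind Proposition 6 can be replayed with a small bounded chase simulator. The sketch below is an illustration only, under these assumptions: atoms contain variables but no constants, fresh nulls are named `_n1`, `_n2`, ..., one trigger is fired per step, and a run is cut off after `max_steps` steps (so a `False` flag means "did not finish within the bound", not a proof of non-termination). The function names are hypothetical.

```python
from itertools import count


def matches(atoms, facts, h):
    """Yield all extensions of h that map every atom in `atoms` into `facts`."""
    if not atoms:
        yield dict(h)
        return
    (pred, args), rest = atoms[0], atoms[1:]
    for fpred, fargs in facts:
        if fpred != pred or len(fargs) != len(args):
            continue
        ext = dict(h)
        if all(ext.setdefault(a, v) == v for a, v in zip(args, fargs)):
            yield from matches(rest, facts, ext)


def chase(tgds, instance, oblivious, max_steps=50):
    """Return (facts, finished). Standard chase fires active triggers only;
    the oblivious chase fires every trigger exactly once."""
    facts, fired, fresh = set(instance), set(), count(1)
    for _ in range(max_steps):
        trigger = None
        for idx, (body, head, exist) in enumerate(tgds):
            for h in matches(body, sorted(facts), {}):
                key = (idx, tuple(sorted(h.items())))
                if key in fired:
                    continue
                # Active: h cannot be extended to map the head into the facts.
                active = next(matches(head, facts, h), None) is None
                if oblivious or active:
                    trigger = (key, head, exist, h)
                    break
            if trigger:
                break
        if trigger is None:
            return facts, True
        key, head, exist, h = trigger
        fired.add(key)
        ext = dict(h)
        for z in sorted(exist):
            ext[z] = f"_n{next(fresh)}"  # fresh null for each existential variable
        facts |= {(p, tuple(ext[a] for a in args)) for p, args in head}
    return facts, False
```

On $I_0 = \{R(a,b)\}$ the standard chase with $\Sigma_3$ stops immediately, while the oblivious run keeps producing new nulls until the step bound is reached; with $\Sigma_4$ the oblivious run stops after one step on $\{S(b), R(a,b)\}$.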

It was shown in [19] that WA $\subset$ SD and also that
SD $\subset CT^{std}_{\forall\forall}$. We can now extend this result by showing
that, similarly to the WA class, the following holds:



Proposition 7. The classes SD and $CT^{obl}_{\forall\forall}$ are
incomparable wrt inclusion.



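Proof. (Sketch) The proof consists of showing that
$\Sigma_3 \in$ SD $\setminus\, CT^{obl}_{\forall\forall}$, and that $\Sigma_5 \in CT^{obl}_{\forall\forall} \setminus$ SD, where

$\Sigma_5 = \{R(x,x) \to \exists y\, R(x,y)\}$.

Details are omitted. ■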

From the semi-enrichment closure of the WA and SD
classes and Proposition 5 we get the following result.

Proposition 8. WA $\subset CT^{sobl}_{\forall\forall}$ and SD $\subset CT^{sobl}_{\forall\forall}$.
Furthermore, for any instance $I$ and any $\Sigma \in$ SD the
semi-oblivious chase with $\Sigma$ on $I$ terminates in time
polynomial in the size of $I$.

Note that the previous result follows directly also
from a similar result for the class sWA [17]. Still, as
shown by the following proposition, the super-weakly
acyclic class does not include the class of dependencies
that ensures termination for the oblivious chase variation,
nor does the inclusion hold in the other direction.

Proposition 9. sWA and $CT^{obl}_{\forall\forall}$ are incomparable wrt
inclusion.

Proof. (Sketch) We exhibit the super-weakly acyclic
set $\Sigma_3 = \{R(x,y) \to \exists z\, R(x,z)\}$. It is clear that $\Sigma_3 \notin CT^{obl}_{\forall\forall}$.
For the converse, let

$\Sigma_6 = \{S(x), R(x,y) \to \exists z\, R(y,z)\}$.

Then $\Sigma_6 \notin$ sWA; on the other hand it can be observed
that the oblivious chase with $\Sigma_6$ terminates on all
instances. This is because tuples with new nulls cannot
cause the dependency to fire, as these new nulls will
never be present in relation $S$. ■



Stratification based classes

Consider $\Sigma_7 = \{\xi_1, \xi_2\}$, where

$\xi_1 = R(x,x) \to \exists z\, S(x,z)$, and
$\xi_2 = R(x,y), S(x,z) \to R(z,x)$.

In the dependency graph of $\Sigma_7$ we will have the cycle
$(R,1) \to (S,2) \to (R,1)$, and since $(S,2)$ is an existential
position, the set $\Sigma_7$ is not weakly acyclic. However,
it is easy to see that $\Sigma_7 \in CT^{std}_{\forall\forall}$. It is also easily
seen that if $S$ is empty and $R$ non-empty, then $\xi_1$ will
"cause" $\xi_2$ to fire for every tuple in $R$. Let us denote
this relationship by $\xi_1 \prec \xi_2$. On the other hand, there
is no chase sequence in which a new null in $(S,2)$ can
be propagated back to a tuple in $R$ and fire a trigger
based on $\xi_1$, and thus create an infinite loop. We denote
this with $\xi_2 \nprec \xi_1$. In comparison, when chasing with

$\Sigma_8 = \{R(x,y) \to \exists z\, R(z,x)\}$,

the new null $z_i$ in $(z_i, z_{i-1})$ will propagate into tuple
$(z_{i+1}, z_i)$, in an infinite regress. If we denote the tgd in
$\Sigma_8$ with $\xi$, we conclude that $\xi \prec \xi$. A formal definition
of the $\prec$ relation is given in the next section.

The preceding observations led Deutsch et al. [6] to
define the class of stratified dependencies by considering
the chase graph of a set $\Sigma$, where the individual tgd's
in $\Sigma$ are the vertices and there is an edge from $\xi_1$ to
$\xi_2$ when $\xi_1 \prec \xi_2$. A set $\Sigma$ is then said to be stratified
if the vertex-set of every cycle in the chase graph forms
a weakly acyclic set. The class of all sets of stratified
tgd's is denoted Str. In the previous example, $\Sigma_7 \in$ Str,
and $\Sigma_8 \notin$ Str.



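In [19] Meier et al. observed that Str $\not\subset CT^{std}_{\forall\forall}$ and
that actually only Str $\subset CT^{std}_{\forall\exists}$, and came up with a
corrected definition of $\prec$, which yielded the corrected
stratified class CStr of tgd's, for which they showed

Theorem 10. [19]
CStr $\subset CT^{std}_{\forall\forall}$, Str $\subset CT^{std}_{\forall\exists}$, and CStr $\subset$ Str.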



From the observation that the CStr class is closed
under semi-enrichment and from Proposition 5 we have:

Proposition 10.

1. CStr $\subset CT^{sobl}_{\forall\forall}$.

2. CStr and $CT^{obl}_{\forall\forall}$ are incomparable wrt inclusion.

Proof. (Sketch) For the second part we have $\Sigma_3 \in$
CStr and $\Sigma_3 \notin CT^{obl}_{\forall\forall}$. For the converse consider the
dependency set $\Sigma_6$ from the proof of Proposition 9. ■



Meier et al. [20] further observed that the basic
stratification definition also catches some false negatives.
For this they considered the dependency set $\Sigma_9 = \{\xi_3, \xi_4\}$,
where

$\xi_3 = S(x), E(x,y) \to E(y,x)$, and
$\xi_4 = S(x), E(x,y) \to \exists z\, E(y,z), E(z,x)$.

Here $\xi_3$ and $\xi_4$ belong to the same stratum according
to the definition of CStr. Since new nulls in both $(E,1)$
and $(E,2)$ can be caused by $(E,1)$ and $(E,2)$, there
will be generating self-loops on these vertices in the
dependency graph. Hence $\Sigma_9 \notin$ CStr. On the other
hand, it is easy to see that the number of new nulls
that can be generated in the chase is bounded by the
number of tuples in relation $S$ in the initial instance.
Consequently $\Sigma_9 \in CT^{std}_{\forall\forall}$.

In order to avoid such false negatives, Meier et al.
gave an alternative definition of the $\prec$ relation and of
the chase graph. Both of these definitions are however
technically rather involved, and will not be repeated
here. The new inductively restricted class, abbreviated
IR, restricts each connected component in the modified
chase graph to be in SD. In the example above, $\Sigma_9 \in$ IR.

Meier et al. [20] also observed that IR only catches
binary relationships $\xi_1 \prec \xi_2$. This could be generalized
to a ternary relation $\prec(\xi_1, \xi_2, \xi_3)$, meaning that there
exists a chase sequence such that firing $\xi_1$ will cause $\xi_2$
to fire, and this in turn causes $\xi_3$ to fire. This will eliminate
those cases where $\xi_1$, $\xi_2$ and $\xi_3$ form a connected
component in the (modified) chase graph, and yet there
is no chase sequence that will fire $\xi_1$, $\xi_2$ and $\xi_3$ in this
order. Thus the three dependencies should not be in the
same stratum.

Similarly to the ternary extension, the $\prec$ relation can
be generalized to be $k$-ary. The resulting termination
classes are denoted $T[k]$. Thus $T[2] =$ IR, and in general
$T[k] \subset T[k+1]$ [20]. The main property is

Theorem 11. [19]
CStr $\subset$ IR $= T[2] \subset T[3] \subset \cdots \subset T[k] \subset \cdots \subset CT^{std}_{\forall\forall}$.

To complete the picture, we have the following proposition
based on the semi-enrichment closure for the $T[k]$
hierarchy and Proposition 5.

Proposition 11.

1. $T[k] \subset CT^{sobl}_{\forall\forall}$.

2. $T[k]$ and $CT^{obl}_{\forall\forall}$ are incomparable wrt inclusion.

Before concluding this section we need to mention that
all the classes discussed here are closed under semi-enrichment,
thus they ensure the termination for the
less expensive semi-oblivious chase in a polynomial number
of steps, in the size of the input instance.

The Hasse diagram in Figure 2 from the Appendix
summarizes the stratification based classes and their
termination properties.
6. COMPLEXITY OF STRATIFICATION

As we noted in Section 5, all the acyclicity based
classes have the property that testing whether a given
set $\Sigma$ belongs to it can be done in PTIME. The situation
changes when we move to the stratified classes. The
authors of [6] claimed that testing if $\xi_1 \prec \xi_2$ is in NP
for a given $\xi_1$ and $\xi_2$, thus resulting in Str having a
coNP membership problem. We shall see in Theorem 12
below that this cannot be the case, unless NP = coNP.
We shall use the $\prec$ order as it is defined for the CStr
class. The results also hold for the Str class. First we
need a formal definition.

Definition 1. [19] Let $\xi_1$ and $\xi_2$ be tgd's. Then $\xi_1$
precedes $\xi_2$, denoted $\xi_1 \prec \xi_2$, if there exists an instance
$I$ and homomorphisms $h_1$ and $h_2$ from the universal
variables in $\xi_1$ and $\xi_2$, respectively, such that:

(i) $I \models h_2(\xi_2)$, and

(ii) $I \xrightarrow{(\xi_1,h_1)} J$ using an oblivious chase step, and

(iii) $J \not\models h_2(\xi_2)$.

Note that the pair $(\xi_1, h_1)$ in the previous definition
denotes a trigger, not necessarily an active trigger,
because the chase step considered is the oblivious one.
Intuitively, the instance $I$ in the definition is a witness
to the "causal" relationship between $\xi_1$ and $\xi_2$ (via $h_2$),
as $h_2(\xi_2)$ won't fire at $I$, but will fire once $\xi_1$ has been
applied. The notion of stratum of $\Sigma$ is as before, i.e.
we build a chase graph consisting of a vertex for each
tgd in $\Sigma$, and an edge from $\xi_1$ to $\xi_2$ if $\xi_1 \prec \xi_2$. Then $\xi_1$
and $\xi_2$ are in the same stratum when they both belong
to the same cycle in the chase graph of $\Sigma$. A set $\Sigma$ of
tgd's is said to be C-stratified (CStr) if all its strata are
weakly acyclic [19].



Theorem 12.

1. Given two tgd's $\xi_1$ and $\xi_2$, the problem of testing
if $\xi_1 \prec \xi_2$ is coNP-hard.

2. Given a set of dependencies $\Sigma$, the problem of testing
if $\Sigma \in$ CStr is NP-hard.

Proof: For part 1 of the theorem we will use a reduction
from the graph 3-colorability problem that is
known to be NP-complete. It is also well known that
a graph $G$ is 3-colorable iff there is a homomorphism
from $G$ to $K_3$, where $K_3$ is the complete graph with
3 vertices. We provide a reduction $G \mapsto \{\xi_1, \xi_2\}$, such
that $G$ is not 3-colorable if and only if $\xi_1 \prec \xi_2$.
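
The equivalence between 3-colorability and the existence of a homomorphism into $K_3$, on which the reduction rests, can be checked by brute force on small graphs. The following is a minimal sketch for illustration only; it enumerates all $3^{|V|}$ vertex maps, and the name `homomorphism_to_k3` is hypothetical.

```python
from itertools import product

# Edge relation of K3 on vertices 0, 1, 2 (both directions, no self-loops).
K3_EDGES = {(u, v) for u in range(3) for v in range(3) if u != v}


def homomorphism_to_k3(vertices, edges):
    """Return a map V -> {0, 1, 2} sending every edge of G to an edge of K3,
    or None if no such map exists (i.e. G is not 3-colorable)."""
    vertices = list(vertices)
    for colors in product(range(3), repeat=len(vertices)):
        h = dict(zip(vertices, colors))
        if all((h[u], h[v]) in K3_EDGES for u, v in edges):
            return h
    return None
```

A triangle admits such a map, whereas the complete graph on four vertices does not.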

We identify a graph $G = (V,E)$, where $|V| = n$ and
$|E| = m$, with the sequence

$G(x_1,\ldots,x_n) = E(x_{i_1},y_{i_1}),\ldots,E(x_{i_m},y_{i_m})$,

and treat the elements in $V$ as variables. Similarly, we
identify the graph $K_3$ with the sequence

$K_3(z_1,z_2,z_3) = E(z_1,z_2),E(z_2,z_1),E(z_1,z_3),E(z_3,z_1),E(z_2,z_3),E(z_3,z_2)$,

where $z_1$, $z_2$, and $z_3$ are variables. With these notations,
given a graph $G = (V,E)$, we construct tgd's $\xi_1$ and $\xi_2$
as follows:

$\xi_1 = R(z) \to \exists z_1,z_2,z_3\; K_3(z_1,z_2,z_3)$, and
$\xi_2 = E(x,y) \to \exists x_1,\ldots,x_n\; G(x_1,\ldots,x_n)$.

Clearly the reduction is polynomial in the size of $G$.
We will now show that $\xi_1 \prec \xi_2$ iff $G$ is not 3-colorable.

First, suppose that $\xi_1 \prec \xi_2$. Then there exists an
instance $I$ and homomorphisms $h_1$ and $h_2$, such that
$I \models h_2(\xi_2)$. Consider $J$, where $I \xrightarrow{(\xi_1,h_1)} J$. Thus $R^I$
had to contain at least one tuple, and $E^I$ had to be
empty, because otherwise the monotonicity property of
the chase would imply that $J \models h_2(\xi_2)$.

On the other hand, we have $I \xrightarrow{(\xi_1,h_1)} J$, where instance
$J = I \cup \{K_3(h_1'(z_1), h_1'(z_2), h_1'(z_3))\}$, and $h_1'$ is a
distinct extension of $h_1$. Since $E^I = \emptyset$, and we assumed
that $J \not\models h_2(\xi_2)$, it follows that there is no homomorphism
from $G$ into $J$, i.e. there is no homomorphism
from $G(h_2'(x_1),\ldots,h_2'(x_n))$ to $K_3(h_1'(z_1), h_1'(z_2), h_1'(z_3))$,
where $h_2'$ is a distinct extension of $h_2$. Therefore the
graph $G$ is not 3-colorable.

For the other direction, let us suppose that graph $G$
is not 3-colorable. This means that there is no homomorphism
from $G$ into $K_3$. With this assumption let
us consider $I = \{R(a)\}$, homomorphism $h_1 = \{z/a\}$ and
homomorphism $h_2 = \{x/h_1'(z_1), y/h_1'(z_2)\}$. It is easy to
verify that $I$, $h_1$ and $h_2$ satisfy the three conditions for
$\xi_1 \prec \xi_2$.

For part 2 of the theorem, consider the set $\Sigma = \{\xi_1, \xi_2\}$
defined as follows:

$\xi_1 = R(z_1,v) \to \exists z_2,z_3,w\; K_3(z_1,z_2,z_3), R(z_2,w), R(z_3,w), S(w)$, and
$\xi_2 = E(x,y) \to \exists x_1,\ldots,x_n,v\; G(x_1,\ldots,x_n), R(x,v)$.

It is straightforward to verify that $\Sigma \notin$ WA and that
$\xi_2 \prec \xi_1$. Similarly to the proof of part 1, it can be shown
that $\xi_1 \prec \xi_2$ iff the graph $G$ is not 3-colorable. From
this follows that $\Sigma \in$ CStr iff there is no cycle in the
chase graph, iff $\xi_1 \nprec \xi_2$ iff $G$ is 3-colorable. ■

Note that the reduction in the previous proof can
be used to show that the problem $\Sigma \in$ Str is NP-hard.
A similar result can also be obtained for the IR class and
also for the local stratification based classes introduced
by Greco et al. in [12]. The obvious upper bound for
the problem $\xi_1 \prec \xi_2$ is given by:

Proposition 12. Given two dependencies $\xi_1$ and $\xi_2$,
the problem of determining whether $\xi_1 \prec \xi_2$ is in $\Sigma_2^P$.

Proof: It is known that if $\xi_1 \prec \xi_2$ there is
an instance $I$ satisfying Definition 1, such that the size of
$I$ is bounded by a polynomial in the size of $\{\xi_1, \xi_2\}$.
Thus, we can guess instance $I$, homomorphisms $h_1$ and
$h_2$ in NP time. Next, with an NP oracle we can check if
$I \models h_2(\xi_2)$ and $J \not\models h_2(\xi_2)$, where $I \xrightarrow{(\xi_1,h_1)} J$. ■

We shall see that the upper bound of the proposition
actually can be lowered to $\Delta_2^P$. For this we need the
following characterization theorem.

Theorem 13. Let $\xi_1 = \alpha_1 \to \beta_1$ and $\xi_2 = \alpha_2 \to \beta_2$ be
tgd's. Then, $\xi_1 \prec \xi_2$ if and only if there is an atom $t$,
and homomorphisms $h_1$ and $h_2$, such that the following
hold.

(a) $t \in \beta_1$,

(b) $h_1(t) \in h_2(\alpha_2)$,

(c) $h_1(t) \notin h_1(\alpha_1)$, and

(d) There is no idempotent homomorphism from
$h_2(\beta_2)$ to $h_2(\alpha_2) \cup h_1(\alpha_1) \cup h_1(\beta_1)$.

Proof: We first prove the "only if" direction. For
this, suppose that $\xi_1 \prec \xi_2$, that is, there exists an instance
$I$ and homomorphisms $g_1$ and $g_2$, such that
conditions (i) - (iii) of Definition 1 are fulfilled.

From conditions (ii) and (iii) we have that $g_1(\alpha_1) \subseteq I$
and $g_1'(\beta_1) \not\subseteq I$, for any distinct extension $g_1'$ of $g_1$.

Now, consider $h_1 = g_1$ and $h_2 = g_2$. Let $t$ be an atom
from $\beta_1$ such that $h_1'(t) \in h_1'(\beta_1) \cap h_2(\alpha_2)$ and
$h_1'(t) \notin h_1(\alpha_1)$, for an extension $h_1'$ of $h_1$. Such an atom $t$ must
exist, since otherwise it will be that $h_1'(\beta_1) \cap h_2(\alpha_2) \subseteq h_1(\alpha_1)$,
which is not possible because of conditions (i)
and (iii) (note that $\xi_1 = \alpha_1 \to \beta_1$). It is now easy to see that
$t$, $h_1'$ and $h_2$ satisfy conditions (a), (b), and (c) of the
theorem. It remains to show that condition (d) also is
satisfied. By construction we have $J = I \cup h_1'(\beta_1)$. It now
follows that $I \cup h_1'(\beta_1) \not\models h_2(\xi_2)$. Because $h_1(\alpha_1) \subseteq I$,
condition (d) is indeed satisfied.

For the "if" direction of the theorem, suppose that
there exists an atom $t$ and homomorphisms $h_1$ and $h_2$,
such that conditions (a), (b), (c) and (d) hold. Let
$g_1 = h_1$, $g_2 = h_2$ and let $I = (h_1(\alpha_1) \cup h_2(\alpha_2)) \setminus \{h_1'(t)\}$,
for a distinct extension $h_1'$ of $h_1$. Because $h_1'(t) \notin I$
and $h_1'(t) \in h_2(\alpha_2)$, it follows that $h_2(\alpha_2) \not\subseteq I$. Thus
we have $I \models h_2(\xi_2)$, proving point (i) of Definition 1.
On the other hand, because point (c) of the theorem is
assumed, it follows that $h_1(\alpha_1) \subseteq I$, from which we get
$I \xrightarrow{(\xi_1,h_1)} J$, where $J = I \cup h_1'(\beta_1)$, proving points (i)
and (ii) from Definition 1. Since $I \cup h_1'(\beta_1) = h_1(\alpha_1) \cup
h_2(\alpha_2) \cup h_1'(\beta_1)$, and point (d) holds, we get $J \not\models h_2(\xi_2)$,
thus showing that condition (iii) of Definition 1 is also
satisfied. ■



It is easy to note that by adding the extra condition
(e) there is no idempotent homomorphism from $\beta_1$ to
$\alpha_1$ in the previous theorem we obtain a characterization
of the stratification order associated with the Str class.

With this characterization result we can now tighten
the $\Sigma_2^P$ upper bound of Proposition 12 as follows:



Theorem 14. Given two dependencies $\xi_1$ and $\xi_2$, the
problem of determining whether $\xi_1 \prec \xi_2$ is in $\Delta_2^P$.



Proof: For this proof we will use the characterization
Theorem 13 and the observation that $\Delta_2^P = P^{NP}$.
Consider the following PTIME algorithm that
enumerates all possible $h_1$, $h_2$ and $t$:

for $t \in \beta_1$
    do for all $(h_1,h_2) = mgu(t,\alpha_2)$
        do if $h_1(t) \notin h_1(\alpha_1)$ return $t, h_1, h_2$

In the algorithm, $mgu(t,\alpha_2)$ denotes all pairs $(h_1,h_2)$
such that there exists an atom $t' \in \alpha_2$, with $h_1(t) =
h_2(t')$, and there is no $(g_1,g_2)$ and $f$ different from the
identity mappings, such that $h_1 = g_1 \circ f$ and $h_2 = g_2 \circ f$.
Using the values returned by the previous algorithm and
with a coNP oracle we can test if point (d) holds. Thus,
the problem is in $\Delta_2^P$. ■
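
The enumeration step of this proof can be illustrated with a small unification routine. The sketch below covers conditions (a) to (c) of Theorem 13 only; condition (d), which needs the coNP oracle, is not implemented. It rests on simplifying assumptions that are not part of the paper: terms are strings, variables start with `?`, the two tgd's use disjoint variable names so that a single substitution plays the role of both $h_1$ and $h_2$, and existential variables are treated like ordinary variables as a stand-in for the distinct extension. All function names are hypothetical.

```python
def is_var(term):
    return term[:1] == "?"


def mgu(atom1, atom2):
    """Most general unifier of two function-free atoms, returned as a
    function term -> representative, or None if they do not unify."""
    (p, args1), (q, args2) = atom1, atom2
    if p != q or len(args1) != len(args2):
        return None
    parent = {}

    def find(t):
        while parent.get(t, t) != t:
            t = parent[t]
        return t

    for a, b in zip(args1, args2):
        ra, rb = find(a), find(b)
        if ra == rb:
            continue
        if not is_var(ra):
            ra, rb = rb, ra      # keep a constant as the class representative
        if not is_var(ra):
            return None          # two distinct constants clash
        parent[ra] = rb
    return find


def apply(subst, atom):
    pred, args = atom
    return (pred, tuple(subst(a) for a in args))


def candidates(tgd1, tgd2):
    """Yield (t, t', unifier) with t in beta1, t' in alpha2, satisfying (a)-(c)."""
    (alpha1, beta1), (alpha2, _beta2) = tgd1, tgd2
    for t in beta1:
        for t2 in alpha2:
            theta = mgu(t, t2)
            if theta is None:
                continue
            if apply(theta, t) not in {apply(theta, a) for a in alpha1}:
                yield t, t2, theta
```

For the tgd's $\xi_1$ and $\xi_2$ of $\Sigma_7$ (with variables renamed apart) the generator yields exactly one candidate, obtained by unifying the head atom of $\xi_1$ with the $S$-atom in the body of $\xi_2$.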

Armed with these results we can now state the upper
bound for the complexity of the CStr membership problem.

Theorem 15. Let $\Sigma$ be a set of tgd's, then the problem
of testing if $\Sigma \in$ CStr is in $\Pi_2^P$.

Proof: (Sketch) To prove that $\Sigma$ is not in CStr guess
a set of tuples $(\xi_1,t^1,h_1^1,h_2^1),\ldots,(\xi_k,t^k,h_1^k,h_2^k)$, where
$\xi_1,\ldots,\xi_k$ are tgd's in $\Sigma$, $t^1,\ldots,t^k$ are atoms, and $h_1^i, h_2^i$
are homomorphisms, for $i \in \{1,\ldots,k\}$. Then, using an
NP oracle check that $\xi_i \prec \xi_{i+1}$, for $i \in \{1,\ldots,k-1\}$, and
$\xi_k \prec \xi_1$, using the characterization Theorem 13 with
$t^i, h_1^i, h_2^i$ and $t^k, h_1^k, h_2^k$ respectively. And then check
in PTIME if the set of dependencies $\{\xi_1,\ldots,\xi_k\}$ is not
weakly acyclic. Thus, the complexity is $\Pi_2^P$. ■



We note that using the obvious upper bound $\Sigma_2^P$ for
testing if $\xi_1 \prec \xi_2$, the membership problem for the class
CStr would be in $\Pi_3^P$. As mentioned the same results
apply also for the class Str. Even if the complexity
bounds for testing if $\xi_1 \prec \xi_2$ are not tight, it can be
noted that a coNP upper bound would not lower the
$\Pi_2^P$ upper bound of the membership problem for CStr.
7. CONCLUSIONS

We have undertaken a systematization of the somewhat
heterogeneous area of the chase. Our analysis
produced a taxonomy of the various chase versions and
their termination properties, showing that the main
sufficient classes that guarantee termination for the standard
chase also ensure termination for the complexity-wise
less expensive semi-oblivious chase. Even if the
standard chase procedure in general captures more sets
of dependencies that ensure the chase termination than
the semi-oblivious chase, we argue that for most practical
constraints the semi-oblivious chase is a better
choice. We have also proved that the membership problem
for the classes $CT^{core}_{\forall\forall}$ and $CT^{std}_{\forall\exists}$ is coRE-complete
and in case we allow also at least one denial constraint
the same holds for $CT^{std}_{\forall\forall}$, $CT^{sobl}_{\forall\forall}$ and $CT^{obl}_{\forall\forall}$. Still it
remains an open problem whether the membership problem for $CT^{std}_{\forall\forall}$ remains
coRE-complete without denial constraints. The same
also holds for the classes $CT^{sobl}_{\forall\forall}$ and $CT^{obl}_{\forall\forall}$. Finally we
have analyzed the complexity of the membership problem
for the class of stratified sets of dependencies. Our
bounds for this class are not tight, and it remains an
open problem to pinpoint the complexity exactly.
APPENDIX

Note that for the theorems presented in the paper we kept
the same numbering, and the new theorems/lemmas are
numbered consecutively.

4. Chase termination questions 

Theorem 2.

$CT^{obl}_{\forall\forall} = CT^{obl}_{\forall\exists} \subset CT^{sobl}_{\forall\forall} = CT^{sobl}_{\forall\exists} \subset CT^{std}_{\forall\forall} \subset CT^{std}_{\forall\exists}$.

Proof: (Sketch) We will only show the strict inclusion
parts of the theorem. For the first inclusion, let
$\Sigma = \{R(x,y) \to \exists z\, R(x,z)\}$. It is easy to see that
$\Sigma \in CT^{sobl}_{\forall\forall}$ and $\Sigma \notin CT^{obl}_{\forall\forall}$. The second strict inclusion
$CT^{sobl}_{\forall\forall} \subset CT^{std}_{\forall\forall}$ is more intricate, as most sets of
dependencies in $CT^{std}_{\forall\forall}$ are part of $CT^{sobl}_{\forall\forall}$. To distinguish the two
classes, let $\Sigma = \{\xi_1, \xi_2\}$, where

$\xi_1 = R(x) \to \exists z\, S(z), T(z,x)$, and
$\xi_2 = S(x) \to \exists z'\, R(z'), T(x,z')$.

Let now $I$ be an arbitrary instance, and suppose that
$I = \{R(a_1),\ldots,R(a_n),S(b_1),\ldots,S(b_m)\}$. There is no
loss of generality, since if the standard chase with $\Sigma$ on $I$
terminates, then it will also terminate even if the initial
instance contains atoms over the relational symbol $T$.
It is easy to see that all standard chase sequences with
$\Sigma$ on $I$ will terminate in the instance $A \cup B$, where

$A = \bigcup_{i \in \{1,\ldots,n\}} \{R(a_i), S(z_i), T(z_i,a_i)\}$, and

$B = \bigcup_{i \in \{1,\ldots,m\}} \{R(z'_i), S(b_i), T(b_i,z'_i)\}$.

On the other hand, all semi-oblivious chase sequences
will converge only at the infinite instance

$\bigcup_{k \in \mathbb{N}} (C_k \cup D_k) \cup A \cup B$,

where



Cfc - U{'S'(^fen + (fc-l)m + i)i J^(^fcn+(fc-l)m+ii2(fc-l)m+(fc-l)n+i)}'-' 
i-1 

n 

J = l 
n 

Dk = U{-^(^fcm+(fc-l)n+i)'^(^(fc-l)n+(fc-l)m+i.^fcm+(fc-l)n+i)}'-' 
i-1 

m 

km+kn+j 

J = l 

For the last inclusion $CT^{std}_{\forall\forall} \subset CT^{std}_{\forall\exists}$, consider the
set $\Sigma = \{R(x,y) \to R(y,y);\; R(x,y) \to \exists z\, R(y,z)\}$.
Clearly $\Sigma \in CT^{std}_{\forall\exists}$, because for any instance all chase
sequences that start by firing the first tgd will terminate.
On the other hand $\Sigma \notin CT^{std}_{\forall\forall}$, as the standard chase
with $\Sigma$ does not terminate on $I = \{R(a,b)\}$ whenever
the second tgd is applied first. ■

4.2 Undecidability of termination 

Hitherto, the following results have been obtained.
Theorem 5.

1. $CT^{std}_{I\forall}$ and $CT^{std}_{I\exists}$ are RE-complete [6].

2. $CT^{core}_{I\forall} = CT^{core}_{I\exists}$, and both sets are RE-complete [6].

3. $CT^{sobl}_{I\forall} = CT^{sobl}_{I\exists}$, and both sets are RE-complete [17].

4. Let $\Sigma$ be a set of guarded tgd's [4]. Then the question
$\Sigma \in CT^{core}_{I\forall}$ is decidable [13].

Before describing the reduction and the proofs for
Theorems 6, 7 and 8 let us first give a brief description
of word-rewriting systems.

Word rewriting systems. Let $A$ be a finite set of
symbols, denoted $a, b, \ldots$, possibly subscripted, and $A^*$
the set of all finite words over $A$. Let $\Theta$ be a finite subset
of $A^* \times A^*$. Treating each pair in $\Theta$ as a rule, the relation
$\Theta$ gives rise to a rewriting relation $\to_\Theta \subseteq A^* \times A^*$
defined as $\{(u,v) : u = x\ell y,\ v = xry,\ (\ell,r) \in \Theta\}$.

We use the notation $u \to_\Theta v$ instead of $\to_\Theta(u,v)$. If
$\Theta$ is understood from the context we will simply write
$u \to v$. If we want to emphasize which rule $\rho \in \Theta$ was
used we write $u \to_\rho v$. When $u \to_\rho v$ we say that $v$ is
obtained from $u$ by a rewriting step (based on $\rho$). By
$u \to^n v$ we mean that $v$ can be obtained from $u$ in at
most $n$ rewriting steps. A rewriting system is then a
pair $(A^*, \Theta)$. If $A$ is understood from the context we
shall denote a rewriting system simply with $\Theta$.
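
As a small illustration of the rewriting relation, the following sketch computes all words reachable from a given word in exactly one rewriting step. It assumes words are Python strings and a rule set is a list of (left, right) string pairs; the function name `rewrite_steps` is hypothetical.

```python
def rewrite_steps(word, rules):
    """All words v with word -> v in one step, i.e. word = x + l + y and
    v = x + r + y for some rule (l, r) and some split x, y."""
    result = set()
    for left, right in rules:
        start = word.find(left)
        while start != -1:
            result.add(word[:start] + right + word[start + len(left):])
            start = word.find(left, start + 1)  # next, possibly overlapping, occurrence
    return result
```

With the single rule (01, 10), the word 0101 can be rewritten at two positions, giving 1001 and 0110, while 11 admits no rewriting step.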

A sequence $w_0, w_1, w_2, \ldots$ of words from $A^*$ is said
to be a $\Theta$-derivation sequence (or simply a derivation
sequence), if $w_i \to w_{i+1}$ for all $i = 0, 1, 2, \ldots$. A derivation
sequence might be finite or infinite. The termination
problem for $\Theta$ and a word $w_0 \in A^*$ is to determine
whether all derivation sequences $w_0, w_1, w_2, \ldots$ originating
from $w_0$ are finite. The uniform termination problem
for $\Theta$ is to determine whether for all words $w_0 \in A^*$,
it holds that all derivation sequences $w_0, w_1, w_2, \ldots$
originating from $w_0$ are finite. It has long been known that
the termination problem is RE-complete [5], and that
the uniform termination problem is coRE-complete [15].



We now describe our reduction $\Theta \mapsto \Sigma_\Theta$. We assume
without loss of generality that $A = \{0,1\}$. The tgd
set $\Sigma_\Theta$ is defined for schema $R_\Theta = \{E, E^*, L, R, D\}$ and
consists of $\{\xi_\rho : \rho \in \Theta\} \cup \{\xi_{L_0}, \xi_{L_1}, \xi_{R_0}, \xi_{R_1}\} \cup TC \cup AD \cup S$,
where $\xi_\rho$ is:

$E(x_0,a_1,x_1),\ldots,E(x_{n-1},a_n,x_n) \to \exists y_0,\ldots,y_m\; L(x_0,y_0), E(y_0,b_1,y_1),\ldots,E(y_{m-1},b_m,y_m), R(x_n,y_m)$,

when $\rho = (a_1 \ldots a_n, b_1 \ldots b_m)$. We will also have the
following "grid creation" rules, using left $L$ and right $R$
predicates.

$\xi_{L_0} = E(x_0,0,x_1), L(x_1,y_1) \to \exists y_0\; L(x_0,y_0), E(y_0,0,y_1)$

$\xi_{L_1} = E(x_0,1,x_1), L(x_1,y_1) \to \exists y_0\; L(x_0,y_0), E(y_0,1,y_1)$

$\xi_{R_0} = R(x_0,z_0), E(x_0,0,x_1) \to \exists z_1\; E(z_0,0,z_1), R(x_1,z_1)$

$\xi_{R_1} = R(x_0,z_0), E(x_0,1,x_1) \to \exists z_1\; E(z_0,1,z_1), R(x_1,z_1)$

In the sequel we will assume, unless otherwise stated,
that all instances are over schema $R_\Theta$. For an instance $I$
the following rule set AD ("active domain") computes
$dom(I) \cup A$ in relation $D$.

$\to D(0), D(1)$

$E(x,z,y) \to D(x), D(z), D(y)$

$L(x,y) \to D(x), D(y)$

$R(x,y) \to D(x), D(y)$

$E^*(x,y) \to D(x), D(y)$

Given an instance $I$ we denote by $G_I$ the graph with
edge set:

$\{(x,y) : E(x,z,y) \in I$ or $E^*(x,y) \in I$ or $L(x,y) \in I$ or $R(x,y) \in I\}$.

The following set TC computes in $E^*$ the transitive
closure of $G_I$:

$E(x,z,y) \to E^*(x,y)$
$L(x,y) \to E^*(x,y)$
$R(x,y) \to E^*(x,y)$
$E^*(x,y), E^*(y,z) \to E^*(x,z)$
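
The effect of the TC rules can be illustrated by a naive fixpoint computation. The sketch below assumes an instance is a set of facts of the form (predicate, tuple of terms), with the predicate names `"E"`, `"L"`, `"R"` and `"E*"` written as strings; the function names are hypothetical.

```python
def transitive_closure(instance):
    """Apply the TC rules to a fixpoint and return E* as a set of pairs."""
    star = set()
    for pred, args in instance:
        if pred == "E":
            star.add((args[0], args[2]))        # E(x,z,y) -> E*(x,y)
        elif pred in ("L", "R", "E*"):
            star.add((args[0], args[1]))        # L, R and existing E* edges
    changed = True
    while changed:
        changed = False
        for x, y in list(star):
            for y2, z in list(star):
                if y == y2 and (x, z) not in star:
                    star.add((x, z))            # E*(x,y), E*(y,z) -> E*(x,z)
                    changed = True
    return star


def is_cyclic(instance):
    """G_I has a cycle iff the closure contains some pair (v, v)."""
    return any(x == y for x, y in transitive_closure(instance))
```

A pair $(v,v)$ in the computed relation corresponds to an atom $E^*(v,v)$, which is exactly the situation that makes an instance cyclic.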

If an instance has a cycle in $E^*$, that is there exists
a cycle in $G_I$, the dependencies in the following
"saturation" set $S$ will be fired:

$E^*(v,v), D(x), D(z), D(y) \to E(x,z,y)$

$E^*(v,v), D(x), D(y) \to L(x,y), R(x,y), E^*(x,y)$

In the sequel, we shall sometimes say that an instance
$I$ is cyclic (acyclic) if $G_I$ is cyclic (acyclic). We shall
also speak of "the graph of $I$", when we mean $G_I$.

We denote by $H_I$ the Herbrand base of instance $I$, i.e.
the instance where, for each relation symbol $R \in R_\Theta$,
the interpretation $R^{H_I}$ contains all tuples (of appropriate
arity) that can be formed from the constants in
$(dom(I) \cap Cons) \cup A$. The proof of the following lemma
is straightforward:

Lemma 1. $core(I) = H_I$, whenever $H_I$ is a subinstance
of $I$.



Lemma 2. Let $I$ be an arbitrary instance over schema
$R_\Theta$, and let $I = I_0, I_1, I_2, \ldots$ be the core chase sequence
with $\Sigma_\Theta$ on $I$. If there is an integer $i$ and a constant or
variable $x$, such that $E^*(x,x) \in I_i$ (i.e. the graph $G_{I_j}$ is
cyclic for some $j < i$), then the core chase sequence is
finite.

Proof: (Sketch) First we note that $H_{I_k} = H_I$ for any
instance $I_k$ in the core chase sequence, since the chase
does not add any new constants. If the core chase does
not terminate at the instance $I_i$ mentioned in the claim,
it follows that the dependencies in the set $S$ will fire at $I_i$
and generate $H_I$ as a subinstance. It then follows from
Lemma 1 that $I_{i+1} = H_I$. It is easy to see that $H_I \models \Sigma_\Theta$,
so the core chase will terminate at instance $I_{i+1}$. ■

Intuitively the previous lemma guarantees that when- 
ever we have a cycle in the initial instance the core chase 
process will terminate. Thus, in the following we will 
not have to care about instances that may contain cy- 
cles. 

The following lemma ensures that the core chase with
$\Sigma_\Theta$ on an acyclic instance will not create any cycles.

Lemma 3. Let $I$ be an arbitrary acyclic instance over
schema $R_\Theta$, and let $I = I_0, I_1, I_2, \ldots$ be the core chase
sequence with $\Sigma_\Theta$ on $I$. Then $G_{I_i}$ is acyclic, for all
instances $I_i$ in the sequence.

Proof: (Sketch) Suppose to the contrary that $G_{I_i}$ is
cyclic, for some $I_i$ in the sequence. Wlog we assume that
$I_i$ is the first such instance in the sequence. Clearly $i \geq 1$.
This means that applying all active triggers on $I_{i-1}$
will add a cycle (note that taking the core cannot
add a cycle). Let $(\xi_1,h_1),\ldots,(\xi_n,h_n)$ be the triggers
that add tuples to $I_i$, causing $G_{I_i}$ to be cyclic. First, it
is easy to see that $\{\xi_{L_0}, \xi_{L_1}, \xi_{R_0}, \xi_{R_1}\} \cap \{\xi_1,\ldots,\xi_n\} = \emptyset$.
This is because these dependencies do not introduce any
new edges in $G_{I_i}$ between vertices in $G_{I_{i-1}}$, they only
add a new vertex into $G_{I_i}$ which will have two incoming
edges from vertices already in $G_{I_{i-1}}$. A similar reasoning
shows that none of the $\xi \in TC$ or $\xi \in AD$ can be part of
the set $\{\xi_1,\ldots,\xi_n\}$. Finally, the dependencies in the set
$S$ may introduce cycles and may thus be part of the set
$\{\xi_1,\ldots,\xi_n\}$. But the dependencies in $S$ are fired only
when $E^*(x,x) \in I_{i-1}$, which means that $G_{I_{i-1}}$ already
contains a cycle, namely the self-loop on $x$. This contradicts
our assumption that $I_i$ is the first instance in
the chase sequence that contains a cycle. ■

We still need a few more notions. A path $\pi$ of an
instance $I$ over $R_\Theta$ is a set

$\{E(x_0, a_1, x_1), E(x_1, a_2, x_2), \ldots, E(x_{n-1}, a_n, x_n)\}$

of atoms of $I$, such that $\{a_1, a_2, \ldots, a_n\} \subseteq A$ (recall that
$A$ is the alphabet of the rewriting system $\Theta$). The word
spelled by the path $\pi$ is

$word(\pi) = a_1 a_2 \ldots a_n$.

A max-path $\pi$ in an instance $I$ is a path such that
no path $\pi'$ in $I$ is a strict superset of $\pi$. We can now
relate words and instances as follows: let $I$ be an acyclic
instance; we define

$paths(I) = \{\pi : \pi \text{ is a max-path in } I\}$.

Clearly $paths(I)$ is finite, for any finite instance $I$.
Conversely, let $w = a_1 a_2 \ldots a_n \in A^*$. We define

$I_w = \{E(x_0, a_1, x_1), E(x_1, a_2, x_2), \ldots, E(x_{n-1}, a_n, x_n)\}$,

where the $x_i$'s are pairwise distinct variables. Clearly
$paths(I_w) = \{\pi\}$, where $word(\pi) = w$.
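
The definitions of $I_w$, $paths$ and $word$ can be illustrated with a minimal Python sketch. It rests on several assumptions that are not part of the text: an atom $E(x, a, y)$ is encoded as a triple `(x, a, y)`, a path is stored as an ordered tuple of atoms rather than a set, and the max-paths of an acyclic instance are taken to be exactly the edge sequences that start at a vertex without incoming edges and end at a vertex without outgoing edges. The function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Atom = Tuple[str, str, str]  # an atom E(x, a, y): an edge x -> y labelled a


def instance_of_word(w: str) -> Set[Atom]:
    """Build I_w: a line of len(w) edges over pairwise distinct variables."""
    return {(f"x{i}", a, f"x{i + 1}") for i, a in enumerate(w)}


def max_paths(instance: Set[Atom]) -> List[Tuple[Atom, ...]]:
    """Enumerate the max-paths of an ACYCLIC instance (assumption: the input
    has no cycles; on a cyclic input the recursion would not terminate)."""
    outgoing: Dict[str, List[Atom]] = {}
    has_incoming: Set[str] = set()
    for atom in sorted(instance):
        outgoing.setdefault(atom[0], []).append(atom)
        has_incoming.add(atom[2])

    found: List[Tuple[Atom, ...]] = []

    def extend(node: str, prefix: Tuple[Atom, ...]) -> None:
        edges = outgoing.get(node, [])
        if not edges:  # no outgoing edge: the path cannot be extended further
            found.append(prefix)
            return
        for atom in edges:
            extend(atom[2], prefix + (atom,))

    # Only start at vertices without incoming edges, so that the path
    # cannot be extended to the left either.
    for start in sorted(outgoing):
        if start not in has_incoming:
            extend(start, ())
    return found


def word(path: Tuple[Atom, ...]) -> str:
    """The word spelled by a path: the concatenation of its edge labels."""
    return "".join(a for (_, a, _) in path)
```

For instance, `max_paths(instance_of_word("1101"))` yields a single path, and `word` applied to it gives back the original word, matching the observation that $paths(I_w)$ is a singleton.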

Lemma 4. Let $w \in A^*$, and let $I_w = I_0, I_1, I_2, \ldots$ be
the core chase sequence with $\Sigma_\Theta$ on $I_w$. For each
instance $I_i$ in the sequence, denote by $I'_i$ the instance
obtained from $I_i$ by firing all active triggers, that is
$I_{i+1} = core(I'_i)$. Then $paths(I'_i) = paths(I_{i+1})$.

Proof: (Sketch) If there were a path $\pi$ such that
$\pi \in paths(I'_i) \setminus paths(core(I'_i))$, there would have to be
atoms of the form L{x, x) and x) in the instance I[, 
which would contradict Lemma [3] as 1^ , by definition, 
does not contain any cycles. ■

In order to be able to relate rewrite sequences and
core chase sequences we introduce rewrite trees and
path trees. For a rewriting system $\Theta$ and word $w \in A^*$,
we construct the rewrite tree $T_w$ inductively. Start with
a root node labelled $w$. Then, for each leaf node $n$ in
$T_w$, for each possible derivation $v \to u$, where $v$ is the
label of $n$, add a new node $m$ as a child of $n$, and label
$m$ with $u$.

The next lemma follows directly from the construction
of $T_w$.

Lemma 5. Let $w \in A^*$. Then $T_w$ has an infinite
branch if and only if there is an infinite rewriting derivation
$w = w_0, w_1, w_2, \ldots$ generated by $\Theta$ from $w$.
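
The inductive construction of the rewrite tree can be sketched in Python as follows. This is only an illustration under stated assumptions: a rewriting system is encoded as a list of `(lhs, rhs)` string pairs with non-empty left-hand sides, a tree node is a plain dictionary, and the tree is cut off at a caller-supplied depth, since the full tree may be infinite. The names `successors`, `rewrite_tree` and `longest_branch` are hypothetical.

```python
from typing import Dict, List, Tuple

Rule = Tuple[str, str]  # (lhs, rhs): replace one occurrence of lhs by rhs


def successors(v: str, rules: List[Rule]) -> List[str]:
    """All words u with v -> u in one step (one child per rule occurrence)."""
    result = []
    for lhs, rhs in rules:
        start = v.find(lhs)
        while start != -1:
            result.append(v[:start] + rhs + v[start + len(lhs):])
            start = v.find(lhs, start + 1)
    return result


def rewrite_tree(w: str, rules: List[Rule], depth: int) -> Dict:
    """The rewrite tree rooted at w, truncated after `depth` further levels."""
    node = {"label": w, "children": []}
    if depth > 0:
        node["children"] = [
            rewrite_tree(u, rules, depth - 1) for u in successors(w, rules)
        ]
    return node


def longest_branch(tree: Dict) -> int:
    """Number of nodes on a longest root-to-leaf branch of a truncated tree."""
    return 1 + max((longest_branch(c) for c in tree["children"]), default=0)
```

With the system $\{(0, 1)\}$ and the word 1101, the root has the single child 1111, which is a leaf. A system such as the hypothetical `[("0", "00")]` instead produces branches that always reach the depth cutoff, which is the finite shadow of an infinite branch in the sense of Lemma 5.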

We next define the path tree $V_w$ of the core chase
sequence $I_w = I_0, I_1, I_2, \ldots$ generated by $\Sigma_\Theta$ from $I_w$.
The path tree is defined inductively on the levels of the
tree. First, let $V_w$ consist of a single node, labelled with
the single path in $paths(I_w)$. Inductively, for each leaf
node $n$ in $V_w$, where $\pi$ is the label of $n$,

1. for each $\xi_\rho = \alpha \to \beta \in \Sigma_\Theta$, if there is a
homomorphism $h$ such that the trigger $(\xi_\rho, h)$ is active on
$I_i$ and $h(\alpha) \subseteq \pi$, then add a child $m$ labelled with
the unique path in $paths(h'(\beta))$, where $h'$ is the
distinct extension of $h$.

2. for each pair $(\xi_L, \xi_R)$, say $(\xi_{L0}, \xi_{R1})$, if there is a
homomorphism $h$ such that $(\xi_{L0}, h)$ is active on
$I_i$ (and/or $(\xi_{R1}, h)$ is active on $I_i$), and $\pi$ contains
$E(h(x_0), 0, h(x_1))$ (and/or $E(h(x_0), 1, h(x_1)) \in \pi$),
let $h'$ be the distinct extension of $h$, and add a child
node $m$, labelled with the unique path in the set

$paths(\{E(h'(y_0), 0, h'(y_1))\} \cup \pi)$

(labelled with the unique path in

$paths(\{E(h'(y_0), 0, h'(y_1))\} \cup \pi \cup \{E(h'(z_0), 1, h'(z_1))\})$

or with the unique path in

$paths(\pi \cup \{E(h'(z_0), 1, h'(z_1))\})$,

respectively).

Similarly to Lemma 5 we have

Lemma 6. Let $w \in A^*$. Then $V_w$ is infinite if and
only if the core chase sequence $I_w = I_0, I_1, I_2, \ldots$ on $I_w$
with $\Sigma_\Theta$ is infinite.

We can now state the following important theorem. 

Theorem 16. Let $w \in A^*$. Then the core chase sequence
$I_w = I_0, I_1, I_2, \ldots$ with $\Sigma_\Theta$ on $I_w$ is infinite if and
only if there is an infinite derivation $w = w_0, w_1, w_2, \ldots$
generated by $\Theta$.

Proof: (Sketch) For the if part, suppose that there is
an infinite derivation $w = w_0, w_1, w_2, \ldots$. From Lemma 5
it follows that $T_w$ has an infinite branch, and by
construction the branch is labelled by the derivation
sequence $w = w_0, w_1, w_2, \ldots$ generated by $\Theta$. We claim
that there is an infinite branch in $V_w$ labelled with
$\pi_0, \pi_1, \pi_2, \ldots$, and a sequence of indices
$0 = j_0 < j_1 < j_2 < \cdots$ such that $word(\pi_{j_i}) = w_i$, for all
$i = 0, 1, 2, \ldots$. Clearly $word(\pi_0) = w_0$. For the inductive
hypothesis fix $n$, and suppose that $word(\pi_{j_i}) = w_i$ for all
$i = 0, 1, \ldots, n$. Let $w_n = x \ell y$ and $w_{n+1} = x r y$, for some
rule $(\ell, r) \in \Theta$. Also, let $k = \max\{|x|, |y|\}$. It follows
from the inductive hypothesis and the construction of $V_w$
that the branch labelled
$\pi_{j_0}, \ldots, \pi_{j_1}, \ldots, \pi_{j_n}$, where $word(\pi_{j_i}) = w_i$,
continues with nodes labelled
$\pi_{j_n+1}, \pi_{j_n+2}, \ldots, \pi_{j_n+k}$, where
$word(\pi_{j_n+k}) = w_{n+1}$. Since $V_w$ thus has an infinite
branch, Lemma 6 tells us that the core chase sequence is
infinite as well. Figure 1 shows the relationship between
$V_w$ and $T_w$ where $\Theta = \{(0, 1)\}$ and the initial word $w_0$
is 1101.

For the other direction, suppose that the core chase
sequence is infinite. From Lemma 6 it follows that
$V_w$ has an infinite branch. Let this branch be labelled
$\pi_0, \pi_1, \pi_2, \ldots$. We claim that there is an infinite
derivation $w = w_0, w_1, w_2, \ldots$ generated by $\Theta$, and a sequence
$0 = j_0 < j_1 < j_2 < \cdots$ of indices, such that $word(\pi_{j_i}) = w_i$,
for all $i = 0, 1, 2, \ldots$. This can be seen by choosing
$j_{i+1} = j_i + k$, where $\pi_{j_i+k}$ is the first label on the path in
$V_w$ such that $\pi_{j_i} \not\subseteq \pi_{j_i+k}$. ■

This theorem, together with the RE-completeness result
for rewriting termination [5], yields the undecidability
result of Deutsch et al. [6] for core chase termination
on a given instance.

Corollary 3. The set $\mathsf{CT}^{\mathsf{core}}_{I,\forall}$ is RE-complete.

Next we shall relate the uniform termination problem
with the set $\mathsf{CT}^{\mathsf{core}}_{\forall\forall}$. This means that we need to consider
arbitrary instances, not just instances of the form $I_w$,
for $w \in A^*$. First, we introduce a few more notations.

We denote by $\Sigma_\rho$ the set of all dependencies $\xi_\rho$ in $\Sigma_\Theta$.
Likewise, $\Sigma_{LR}$ will denote the set $\{\xi_{L0}, \xi_{L1}, \xi_{R0}, \xi_{R1}\}$.

Let $I$ be an acyclic instance, and $I' = Chase_{\Sigma_{LR}}(I)$,
where $Chase_{\Sigma_{LR}}(I)$ represents the instance returned by
the core chase procedure on $I$ with $\Sigma_{LR}$. It is easy to see
that $I'$ is finite for any finite acyclic instance $I$. Then
let $I^* = \bigcup \{I_{word(\pi)} : \pi \in paths(I')\}$. Intuitively, $I^*$ is
obtained from $I$ by taking each max-path in $I'$, and
making it into a unique line tree in $G_{I^*}$.
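
As a rough illustration of how $I^*$ is assembled, the following Python sketch takes the words spelled by the max-paths of $I'$ and builds one disjoint line instance per word. It assumes that those words have already been computed, that an atom $E(x, a, y)$ is encoded as a triple `(x, a, y)`, and that every max-path receives its own fresh variables (generated here from a hypothetical per-path tag). None of these encoding choices come from the text.

```python
from typing import Iterable, Set, Tuple

Atom = Tuple[str, str, str]  # an atom E(x, a, y)


def line_instance(w: str, tag: int) -> Set[Atom]:
    """A copy of I_w whose variables are made fresh by prefixing the tag."""
    return {(f"p{tag}_x{i}", a, f"p{tag}_x{i + 1}") for i, a in enumerate(w)}


def i_star(path_words: Iterable[str]) -> Set[Atom]:
    """Union of pairwise disjoint line instances, one per max-path word."""
    result: Set[Atom] = set()
    for tag, w in enumerate(path_words):
        result |= line_instance(w, tag)
    return result


def nodes(instance: Set[Atom]) -> Set[str]:
    """All variables occurring in an instance."""
    return {x for (x, _, _) in instance} | {y for (_, _, y) in instance}
```

Because the variables of different lines never coincide, every connected component of the resulting graph is a single line, which is the property the proof of Lemma 8 relies on when it treats the paths of $I^*$ independently.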

Lemma 7. Let $I$ be an arbitrary acyclic instance. Then
the core chase of $I$ with $\Sigma_\Theta$ terminates if and only if the
core chase of $Chase_{\Sigma_{LR}}(I)$ with $\Sigma_\Theta$ terminates.

Proof: Let $I = I_0, I_1, \ldots$ be the core chase sequence
of $I$ with $\Sigma_\Theta$ and $Chase_{\Sigma_{LR}}(I) = J_0, J_1, \ldots$ be the core
chase sequence of $Chase_{\Sigma_{LR}}(I)$ with $\Sigma_\Theta$. Let us first
suppose that the core chase sequence on $Chase_{\Sigma_{LR}}(I)$
does not terminate. In this case it is easy to see
that there must be an integer $i$ such that
$Chase_{\Sigma_{LR}}(I) \subseteq I_i$, and from this it follows that the core
chase for $I$ with $\Sigma_\Theta$ does not terminate either. The other
direction follows directly from the observation that for any $i$
we have $I_i \subseteq J_i$. ■

Lemma 8. Let $I$ be an arbitrary acyclic instance such
that $I \models \Sigma_{LR}$. Then the core chase of $I$ with $\Sigma_\Theta$ is
infinite if and only if the core chase of $I^*$ with $\Sigma_\Theta$ is
infinite.

Proof: (Sketch) Let $I = I_0, I_1, I_2, \ldots$ be the core chase
sequence of $I$ with $\Sigma_\Theta$, and let $I^* = J_0, J_1, \ldots$ be the
core chase sequence of $I^*$ with $\Sigma_\Theta$. Suppose that the
sequence $I_0, I_1, I_2, \ldots$ is infinite.

We will prove by induction that for each $i$ there exists
a $j$ such that for each path $\pi \in paths(I_i)$ there exists a
unique path $\pi' \in paths(J_j)$ such that $word(\pi)$ is a factor of
$word(\pi')$. This proves that the core chase of $I^*$ is infinite
whenever the core chase of $I$ is.

For the base case, let $i = 0$ and consider $j = 0$. By
definition, $I^*$ contains all the paths in $paths(I)$. For
the inductive step let us suppose that for a fixed $i$ it
holds that for any integer $k \leq i$ there exists $j_k$ such
that for each path $\pi \in paths(I_k)$ there exists a unique
path $\pi' \in paths(J_{j_k})$ such that $word(\pi)$ is a factor of
$word(\pi')$. For each path $\pi \in paths(I_{i+1}) \setminus paths(I_i)$
we will assign a unique path $\pi' \in paths(J_j) \setminus paths(J_{j_i})$
for some $j > j_i$ by considering the following two cases:

Case 1. $\pi$ was created by applying a $\xi_\rho$ dependency
for some $\rho = (a_1 \ldots a_n, b_1 \ldots b_m) \in \Theta$. In this case
there must exist $\pi_0 \in paths(I_i)$ such
that $a_1 \ldots a_n$ is a factor of $word(\pi_0)$. From the
induction hypothesis it follows that there exists a $j$ and a
unique $\pi'_0 \in paths(J_j)$ such that $word(\pi_0)$ is a factor of
$word(\pi'_0)$. By transitivity of the word factor relation it
follows that $a_1 \ldots a_n$ is also a factor of $word(\pi'_0)$. But this
means that the same dependency $\xi_\rho$ can be applied, or
was already applied, on $J_j$, from which it follows that
$paths(J_{j+1})$ contains the path $\pi'$ such that
$word(\pi) = b_1 \ldots b_m$ is a factor of $word(\pi')$.
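
The word-level core of Case 1 can be checked on a small example. The sketch below is only an illustration at the level of words, not of chase steps: it assumes words are Python strings, reads "factor" as contiguous substring, and uses a hypothetical rule $(10, 011)$ together with hypothetical words for $\pi_0$ and $\pi'_0$ that do not come from the text.

```python
from typing import Optional


def is_factor(u: str, w: str) -> bool:
    """u is a factor of w if u occurs in w as a contiguous substring."""
    return u in w


def rewrite_first(w: str, lhs: str, rhs: str) -> Optional[str]:
    """Replace the leftmost occurrence of lhs in w by rhs, if there is one."""
    i = w.find(lhs)
    if i < 0:
        return None
    return w[:i] + rhs + w[i + len(lhs):]


# Hypothetical data: the rule (lhs, rhs), word(pi_0) and word(pi_0').
lhs, rhs = "10", "011"
word_pi0 = "110"
word_pi0_prime = "01101"

# lhs is a factor of word(pi_0), which is a factor of word(pi_0') ...
step1 = is_factor(lhs, word_pi0) and is_factor(word_pi0, word_pi0_prime)
# ... so by transitivity the rule is applicable to word(pi_0') as well,
step2 = is_factor(lhs, word_pi0_prime)
# and the rewritten word contains rhs as a factor.
rewritten = rewrite_first(word_pi0_prime, lhs, rhs)
```

Here both `step1` and `step2` hold, and `rewritten` is a word that has `rhs` as a factor, mirroring the claim that $b_1 \ldots b_m$ is a factor of $word(\pi')$.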

Case 2. $\pi$ was created by extending a path $\pi_1 \in paths(I_i)$
using one or two dependencies from $\Sigma_{LR}$. Because of
the assumption $I \models \Sigma_{LR}$ it follows that there must
exist a subpath $\pi_2$ of $\pi_1$ (note that such a subpath is
unique) such that $\pi_2$ was obtained from a $\xi_\rho$ dependency
applied to a path $\pi_3 \in paths(I_k)$, where $k < i$ and
$\rho = (a_1 \ldots a_n, b_1 \ldots b_m)$. Thus, $word(\pi_2) = b_1 \ldots b_m$ and
$word(\pi_3) = v_0 a_1 \ldots a_n v_1$, for some words $v_0$ and $v_1$.
Because the core computation does not shrink existing
paths it follows that there must be a path $\pi_4 \in paths(I_i)$
such that $\pi_3$ is a subpath of $\pi_4$; thus the spelling for $\pi_4$
will be $word(\pi_4) = u_0 v_0 a_1 \ldots a_n v_1 u_1$, for some words $u_0$
and $u_1$. Based on the acyclicity requirement for $I$, Lemma 3,
the definition of the $\Sigma_{LR}$ dependencies and the
observation that applying a $\xi_\rho$ dependency will always
create a new path, it follows that $word(\pi)$ is a factor of
the word $u_0 v_0 b_1 \ldots b_m v_1 u_1$. From the induction
hypothesis we have that for $word(\pi_4) = u_0 v_0 a_1 \ldots a_n v_1 u_1$ there
exists a $j$ and a unique path $\pi' \in paths(J_j)$ such that
$word(\pi_4)$ is a factor of $word(\pi')$. This means that the
core chase process can apply (or has already applied) the
trigger $(\xi_\rho, h)$ that maps the body of $\xi_\rho$ to the path given
by the word $a_1 \ldots a_n$ in $word(\pi_4) = u_0 v_0 a_1 \ldots a_n v_1 u_1$. By
applying this trigger, instance $J_{j+1}$ will contain a path
$\pi'_1$ whose word is a factor of $u_0 v_0 b_1 \ldots b_m v_1 u_1$. After applying
a maximum of $\max(|u_0 v_0|, |v_1 u_1|)$ core chase steps (i.e.
applying the copying dependencies) it follows that
instance $J_{j+1+\max(|u_0 v_0|, |v_1 u_1|)}$ will contain a path $\pi'$ such
that $u_0 v_0 b_1 \ldots b_m v_1 u_1$ is a factor of $word(\pi')$. Note that
the assignment of $\pi$ to $\pi'$ is unique due to the uniqueness
of the trigger applied. Also, by the transitivity of
the factor relation it follows that $word(\pi)$ is also a factor
of $word(\pi')$.

For the converse direction let us suppose that the core
chase of $I^*$ with $\Sigma_\Theta$ does not terminate. Because the
paths in $paths(I^*)$ do not share any node,
it follows that there must be a path $\pi$ in $paths(I^*)$
such that the core chase of $I_{word(\pi)}$ does not terminate.
Clearly, from the definition of $I^*$ it follows that
in the core chase sequence $I = I_0, I_1, I_2, \ldots$ there exists
an integer $i$ and a path $\pi' \in paths(I_i)$
with $word(\pi') = word(\pi)$. This means that for the
instance corresponding to the path $\pi'$ the core chase
will follow core chase steps isomorphic to the ones
used when chasing instance $I_w$, where $w = word(\pi)$.
Thus, the core chase for $\pi'$ will not terminate either. ■

We can now state the following important result 

Theorem 17. A reduction system $\Theta$ uniformly terminates
if and only if the core chase terminates on all
instances for $\Sigma_\Theta$ (i.e. $\Sigma_\Theta \in \mathsf{CT}^{\mathsf{core}}_{\forall\forall}$).

Proof: (Sketch) First let us suppose that $\Sigma_\Theta \in \mathsf{CT}^{\mathsf{core}}_{\forall\forall}$.
Let $w \in A^*$ be an arbitrary word. Because $\Sigma_\Theta \in \mathsf{CT}^{\mathsf{core}}_{\forall\forall}$
it follows that the core chase will also terminate on
instance $I_w$. From this and Theorem 16 it follows that
the rewriting system $\Theta$ will terminate for $w$.

For the other direction suppose that $\Sigma_\Theta \notin \mathsf{CT}^{\mathsf{core}}_{\forall\forall}$.
Then there exists an instance $I$ such that the core chase
sequence $I = I_0, I_1, I_2, \ldots$ with $\Sigma_\Theta$ is infinite. From
Lemma 2 it must be that $I$ is acyclic. From Lemma 7
and Lemma 8 it follows that the core chase of $I^*$ with
$\Sigma_\Theta$ must be infinite as well. But then there must be a
path $\pi$ in $I^*$ such that $\Theta$ admits an infinite derivation
starting from $word(\pi)$. But this means that $\Theta$ is not
uniformly terminating. ■

Using the previous result and the coRE-completeness
result of Huet and Lankford [15], we now have the main
theorem.

Theorem 6. The membership problem for $\mathsf{CT}^{\mathsf{core}}_{\forall\forall}$ is
coRE-complete.

Up to here we proved the undecidability of the $\mathsf{CT}^{\mathsf{core}}_{\forall\forall}$
class. Next we will show that this result can be extended,
with some minor changes, to other termination classes
as well.

Theorem 7. The membership problem for $\mathsf{CT}^{\mathsf{std}}_{\forall\exists}$ is
coRE-complete.

Proof: (Sketch) It is easy to see that the same $\Sigma_\Theta$
reduction works for the $\mathsf{CT}^{\mathsf{std}}_{\forall\exists}$ case as well by
choosing the branch that first applies all the $AD \cup TC \cup S$
dependencies. This is because, in case the initial
arbitrary instance contains a cycle, the full dependencies
$AD \cup TC \cup S$ will saturate the instance and the standard
chase will terminate. If $I$ does not contain any cycles
then, as we showed, during the chase process no cycles
are added and the termination proof is the same as for
the core chase. ■

To show that the basic $\Sigma_\Theta$ reduction cannot be used
for the $\mathsf{CT}^{\mathsf{std}}_{\forall\forall}$ class, consider the word-reduction
system $\Theta = \{(1, 0)\}$ and instance $I = \{E(a, 0, a), L(a, b)\}$.
It is easy to see that the branch that applies the $\xi_{L0}$
dependency first will not terminate, as it will generate
the following infinite set of tuples:

    E                          L

    a     0    a               a    b
    x_1   0    b               a    x_1
    x_2   0    x_1             a    x_2
    ...                        ...
    x_n   0    x_{n-1}         a    x_n
    ...                        ...



On the other hand, it is clear that the reduction system
is uniformly terminating.
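
That the system $\{(1, 0)\}$ terminates on every word can be checked mechanically for short words: each step turns one occurrence of 1 into 0, so a derivation from $w$ has as many steps as $w$ has 1s. The Python sketch below confirms this for all words over $\{0, 1\}$ up to a chosen length. It assumes words are strings and rules are `(lhs, rhs)` pairs; the function names and the length bound are illustrative choices, not part of the text.

```python
from functools import lru_cache
from itertools import product
from typing import List, Tuple

RULES: Tuple[Tuple[str, str], ...] = (("1", "0"),)  # the system {(1, 0)}


def successors(v: str) -> List[str]:
    """All words reachable from v by one rewriting step under RULES."""
    result = []
    for lhs, rhs in RULES:
        start = v.find(lhs)
        while start != -1:
            result.append(v[:start] + rhs + v[start + len(lhs):])
            start = v.find(lhs, start + 1)
    return result


@lru_cache(maxsize=None)
def longest_derivation(v: str) -> int:
    """Length of a longest derivation from v. The recursion is finite here
    because every step of RULES removes one occurrence of '1'."""
    return max((1 + longest_derivation(u) for u in successors(v)), default=0)


def check_up_to(max_len: int) -> bool:
    """Check that longest_derivation(w) equals the number of 1s in w."""
    return all(
        longest_derivation("".join(bits)) == bits.count("1")
        for n in range(max_len + 1)
        for bits in product("01", repeat=n)
    )
```

A call such as `check_up_to(6)` exhaustively covers all words of length at most six. This is a finite sanity check of the counting argument, not a proof of uniform termination.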

The undecidability result can still be obtained for the
$\mathsf{CT}^{\mathsf{std}}_{\forall\forall}$ class if we allow denial constraints. Then we
simply define $\Sigma'_\Theta = \{E^*(x, x) \to \bot\} \cup (\Sigma_\Theta \setminus S)$.

Theorem 8. Let $\Sigma$ be a set of tgd's and one denial
constraint. Then the membership problem $\Sigma \in \mathsf{CT}^{\mathsf{std}}_{\forall\forall}$ is
coRE-complete.

Proof: (Sketch) Similarly to the proof of Theorem 7,
it is easy to see that if an arbitrary instance $I$ contains
a cycle, then the standard chase on $I$ with $\Sigma'_\Theta$ will
terminate on all branches. This is because the fairness
conditions guarantee that the denial constraint will be
fired, and the chase will terminate. ■

Finally, we note that, using the same $\Sigma_\Theta$ reduction, it
can be shown that the classes CTy°y' and CT^y are also 
coRE-complete. 

5. Guaranteed Termination 

The following Hasse diagram summarizes the
stratification-based classes and their termination properties.

[Hasse diagram; surviving node labels: RA, SW]

Figure 2: Sufficient classes.